Favorite quote:

"If we knew what it was we were doing, it would not be called research, would it?" –Albert Einstein

Monday 18 January 2010

List of papers related to moving object databases (MOD): Indexing, Mining, and Similarity Search.

1. 2009,EDBT,Continuous Probabilistic Nearest Neighbor Queries for uncertain trajectories.

2. 2009,EDBT,Fast Object Search on Road Networks.

3. 2009,SIGMOD,Continuous Obstructed Nearest Neighbor Queries.

4. 2009, ICDE, Temporal Outlier Detection in Vehicle Traffic Data.

5. 2009,EDBT,Processing Probabilistic Spatio-Temporal Range Queries over Moving Objects with Uncertainty.

6. 2009,ACM, Similarity measures for trajectory of moving objects in cellular space.

7. 2009,EDBT,Tight results for clustering and summarizing data streams.

8. 2009,EDBT,Efficient Constraint Evaluation in Categorical Sequential Pattern Mining for Trajectory Databases.

9. 2009,SSDBM,Finding Structural Similarity in Time Series Data Using Bag of Patterns Representation.

10. 2009,EDBT,Sequenced Spatio-Temporal Aggregation In Road Networks.

11. 2009,SSDBM,Probabilistic Similarity Search for Uncertain Time Series.

12. 2008,SSDM,Mining Temporal Association Patterns under a Similarity Constraint.

13. 2008, International Symposium on Algorithms and Computation, Detecting Commuting Patterns by Clustering Subtrajectories.

14. 2008,SIGMOD , Scalable Network Distance Browsing in Spatial Databases.

15. 2008,IEEE,Efficient Similarity Search over Future Stream Time Series.

16. 2008,IEEE, Bag of Segments for Motion Trajectory Analysis.

17. 2008,SSDM,Scalable Ubiquitous Data Access in Clustered Sensor Networks.

18. 2008,SSDM,Mining Temporal Association Patterns under a Similarity Constraint.

19. 2008,PODS, Approximation Algorithms for Clustering Uncertain Data.

20. 2008, SCI, Measuring Similarity Between Trajectories of Mobile Objects

21. 2007,SIGMOD,Adaptive Location Constraint Processing.

22. 2007,IEEE, Continuous Clustering of Moving Objects.

23. 2007,ICDE, Index-based Most Similar Trajectory Search.

24. 2007, SIGMOD, Trajectory Clustering: A Partition-and-Group Framework.

25. 2007,TIME, Similarity Search in Trajectory Databases.

26. 2007,SIGMOD,An Efficient and Accurate Method for Evaluating Time Series Similarity.

27. 2007,MLDM,Mining Frequent Trajectories of Moving Objects for Location Prediction.

28. 2007, DASFAA, Clustering Moving Objects in Spatial Networks.

29. 2007,MDM, Moppy – Mobile Object Position Prediction.

30. 2007,GI-Days, Trajectory Similarity of Moving Objects.

31. 2006,SSDM,Sampling Trajectory Streams with Spatiotemporal Criteria.

32. 2006,SSDM,An Extensible Infrastructure for Processing Distributed Geospatial Data Streams.

33. 2006,SSDM,Time Series Analysis Using the Concept of Adaptable Threshold Similarity.

34. 2006,ICDM, Discovery of Collocation Episodes in Spatiotemporal Data.

35. 2006,ICPR, Comparison of Similarity Measures for Trajectory Clustering in Outdoor Surveillance Scenes.

36. 2006,ACM,Global Distance Based Segmentation of Trajectories.

37. 2006,ACM, On Mining Moving Patterns for Object Tracking Sensor Networks.

38. 2006,ACM, Time-focused clustering of trajectories of moving objects.

39. 2006,IEEE, Comparison of Similarity Measures for Trajectory Clustering in Outdoor Surveillance Scenes.

40. 2006, Intelligent Control and Automation Book, Future Location Prediction of Moving Objects Based on Movement Rules.

41. 2005,ICIP, Similarity based vehicle trajectory clustering.

42. 2004,ACM,Clustering Objects on a Spatial Network.

43. 2004,IDEAS, Modeling, Storing and Mining Moving Object Databases.

44. 2004,KDD,Clustering Moving Objects.

45. 2004,KDD,Clustering Moving Objects on Spatial Network.

46. 2004,ICPR, Multi Feature Path Modeling for Video Surveillance.

47. 2004,ADC, Clustering Moving Objects for Spatio-temporal Selectivity Estimation.

48. 2004, Journal of Systems and Software, Temporal moving pattern mining for location based service.

49. 2003,CIVR, Efficient Similar Trajectory-Based Retrieval For Moving Objects In Video Database.

50. 2003,VLDB, Indexing the Positions of Continuously Moving Objects.

51. 2002,ICDE, Discovering Similar Multidimensional Trajectories.

52. 2000,ACM, Indexing the Positions of Continuously Moving Objects.

Saturday 17 October 2009

Finding Structural Similarity in Time Series Data Using Bag-of-Patterns Representation.

Time series data can be represented as a bag of patterns (segments), much like the bag-of-words representation for text data.

· What is the bag-of-words approach?

According to Wikipedia:

The bag-of-words (BoW) model in NLP is a popular method for representing documents that ignores word order. For example, "a good book" and "book good a" are the same under this model. The BoW representation serves as the basic element for further processing, such as object categorization.
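The idea can be sketched in a few lines of Python. This is only a minimal illustration that assumes words are separated by whitespace and that case does not matter; the function name `bag_of_words` is made up for this post.

```python
from collections import Counter

def bag_of_words(text):
    """Return word counts for a text, ignoring word order."""
    # Assumption: words are whitespace-separated and case-insensitive.
    return Counter(text.lower().split())

# Same words in a different order give the same bag.
print(bag_of_words("a good book") == bag_of_words("book good a"))  # True
```

Because only the counts are kept, any reordering of the same words gives an identical representation.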

· There are two kinds of similarities:

o Shape-based similarity:

§ Determines the similarity of two datasets by comparing their local patterns.

§ Works well with short time series.

o Structure-based similarity: determines the similarity by comparing their global structures.

· Why do we need a structure-based similarity approach?

Because most existing approaches focus on shape-based similarity, which works well with short time series but fails to produce satisfactory results with long sequences.

o Example for textual data: To compare two strings, we can use the string edit distance to compute their similarity. However, if we want to compare two documents, we use a higher-level representation that can capture the structure or semantics of the document.
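For reference, the string edit distance mentioned in the textual example can be sketched as the standard Levenshtein dynamic program. This assumes unit costs for insertion, deletion, and substitution, and the function name `edit_distance` is my own.

```python
def edit_distance(a, b):
    """Levenshtein distance, keeping only two rows of the DP table."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

print(edit_distance("kitten", "sitting"))  # 3
```

A measure like this compares strings character by character, which is reasonable for short strings but says little about whether two long documents cover the same material.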

· Time Series: A time series T = t1,…,tp is an ordered set of p real-valued variables.

· Challenges of the algorithm:

o The definition and construction of the pattern “vocabulary.”

o Time series data are composed of consecutive data points, so, unlike words in text, patterns have no natural delimiters.

· Algorithm:

o Step 1: Find the patterns for each time series using PAA.

§ Piecewise Aggregate Approximation (PAA) divides a time series into a user-defined number of equal-length segments and represents each segment by the mean of its data points. The PAA values are then transformed into symbols by using a breakpoint table.

o Step 2: Construct the Bag-of-Patterns (BOP) matrix given two parameters (α: size of the alphabet; ω: size of the patterns produced).

The BOP matrix has one row per pattern and one column per time series; Mij is the frequency of pattern i in sequence j.

 

Pattern \ Time series    1     2     3     ...
aaa                      10    2     3     ...
aab                      0     0     8     ...
aac                      ...
...
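The two steps above can be sketched in plain Python. This is a rough sketch under my own assumptions, not the authors' implementation: patterns are extracted with a sliding window of length `window`, each window is z-normalized before PAA, the alphabet size is fixed at 3, and the breakpoints are rounded values that split a standard normal distribution into three equally likely regions. All function and parameter names are made up for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

# Assumption: alphabet size 3, with rounded breakpoints that cut a
# standard normal distribution into 3 equiprobable regions.
BREAKPOINTS = [-0.43, 0.43]
ALPHABET = "abc"

def znorm(values):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]

def paa(values, segments):
    """Mean of each equal-length piece (assumes len(values) is a multiple of segments)."""
    size = len(values) // segments
    return [mean(values[i * size:(i + 1) * size]) for i in range(segments)]

def to_word(values, segments):
    """Z-normalize, apply PAA, then map each segment mean to a symbol."""
    word = ""
    for m in paa(znorm(values), segments):
        index = sum(m > b for b in BREAKPOINTS)  # number of breakpoints below m
        word += ALPHABET[index]
    return word

def bag_of_patterns(series, window, segments):
    """Count the words produced by sliding a window over one time series."""
    return Counter(to_word(series[i:i + window], segments)
                   for i in range(len(series) - window + 1))

def bop_matrix(all_series, window, segments):
    """Rows are patterns, columns are time series, entries are frequencies."""
    bags = [bag_of_patterns(s, window, segments) for s in all_series]
    patterns = sorted(set().union(*bags))
    return {p: [bag[p] for bag in bags] for p in patterns}

series_a = [1, 1, 5, 5, 9, 9]
series_b = [9, 9, 5, 5, 1, 1]
print(bop_matrix([series_a, series_b], window=6, segments=3))
# {'abc': [1, 0], 'cba': [0, 1]}
```

Each column of the resulting matrix is the histogram of one time series, so two series can be compared with any vector distance on their columns.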

· Empirical Evaluation:

1. Clustering

o Hierarchical Clustering: Compare BOP with CDM, Euclidean distance on raw time series, Dynamic Time Warping on raw time series, and Euclidean distance on DFT coefficients.

§ Result of the experiment on a small collection: with only six datasets, BOP finds the clusters while the shape-based approaches cannot.

§ Result of the experiment on a larger collection of 13 pairs of datasets:

· DFT and Euclidean distance: only 3 pairs are successfully clustered.

· DTW: successfully clusters 10 pairs of datasets. However, its prohibitive time complexity makes it an unrealistic choice of distance measure for large datasets.

· CDM and BOP: all 13 pairs are successfully clustered.

o Advantages of BOP over CDM

1. BOP shows the distribution of patterns as histograms, which helps in understanding the underlying structure of the data.

2. BOP builds the final representation from subsequences, so it is suitable for streaming data.
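To see why DTW is costly on long series, here is a minimal sketch of the classic dynamic-programming formulation. It fills a table with one cell per pair of points, so time grows with the product of the two lengths. The squared point cost, the final square root, and the absence of any warping-window constraint are my assumptions; the experiments may have used a different variant.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping; time is O(len(x) * len(y))."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] = best cumulative cost aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # x advances alone
                                 cost[i][j - 1],      # y advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m] ** 0.5

# Warping absorbs the repeated point, so the distance is zero.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Unlike the Euclidean distance, this also works for series of different lengths, but the quadratic table is what makes it expensive for large datasets.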

o Partitional Clustering: k-means is run with the Euclidean distance on the raw data and on the bag-of-patterns representation. The evaluation method compares two sets of cluster labels and returns a number between 0 and 1 denoting how similar they are. The bag-of-patterns approach achieves the best clustering quality (0.7133 vs. 0.4644).
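This summary does not say which evaluation measure was used. As an illustration only, the sketch below assumes one overlap-based measure that fits the description: for each true cluster, take the best score 2|G ∩ A| / (|G| + |A|) over all found clusters, then average over the true clusters. The function name is hypothetical.

```python
def cluster_similarity(true_labels, found_labels):
    """Average, over true clusters, of the best overlap with any found cluster."""
    def groups(labels):
        # Map each label to the set of object indices carrying it.
        result = {}
        for index, label in enumerate(labels):
            result.setdefault(label, set()).add(index)
        return list(result.values())

    truth, found = groups(true_labels), groups(found_labels)
    total = 0.0
    for g in truth:
        total += max(2 * len(g & a) / (len(g) + len(a)) for a in found)
    return total / len(truth)

# Label names do not matter, only the grouping does.
print(cluster_similarity([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

A score of 1 means the two groupings are identical, and lower scores mean the found clusters overlap less with the true ones. Note that this particular measure is not symmetric in its two arguments.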

2. Classification: Uses the nearest neighbor classification algorithm. Accuracy is measured as the ratio of the number of correctly classified objects to the total number of objects. The result was 0.996, which means that only 1 object out of 250 was misclassified.
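The accuracy figure is easy to check, and a one-nearest-neighbor classifier over pattern histograms can be sketched briefly. The Euclidean distance on the histogram vectors and the function names are my assumptions; the toy vectors below are invented for illustration and are not data from the paper.

```python
def nearest_neighbor_label(query, training):
    """Label of the training vector closest to `query` (squared Euclidean)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    _, best_label = min(training, key=lambda item: sq_dist(query, item[0]))
    return best_label

def accuracy(predicted, actual):
    """Number of correctly classified objects divided by the total."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Toy histograms (invented): each vector holds pattern frequencies.
training = [([10, 0], "x"), ([0, 10], "y")]
print(nearest_neighbor_label([9, 1], training))  # x

# 249 correct out of 250 objects gives the reported accuracy.
print(accuracy(["x"] * 249 + ["y"], ["x"] * 250))  # 0.996
```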

Source: Lin, J. and Li, Y. 2009. Finding Structural Similarity in Time Series Data Using Bag-of-Patterns Representation. In Proceedings of the 21st International Conference on Scientific and Statistical Database Management (New Orleans, LA, USA, June 02 - 04, 2009). M. Winslett, Ed. Lecture Notes In Computer Science, vol. 5566. Springer-Verlag, Berlin, Heidelberg, 461-477. DOI= http://dx.doi.org/10.1007/978-3-642-02279-1_33.
 

Thursday 8 October 2009

My First Blog Post.

Welcome to my first blog post. I have been planning to start blogging for a long time, but I postponed it because I'm not really very good at this whole blogging thing. Nevertheless, today I decided to start my own blog to share knowledge with you, especially the research papers that I read. I will always try to write about things that are of some interest to you. Meanwhile, congratulate me on making my first blog post. Finally, I want to say “THANKS” to GOD and then to all the people around me who always support and encourage me. Special “THANKS” also go to my friend and teammate “Ebeid”, who helped me choose my blog name. If you can’t choose your blog name, ask him :)