International Journal of Computer Engineering and Applications, Volume XI, Issue IX, September 17, ISSN

CRIME ASSOCIATION ALGORITHM USING AGGLOMERATIVE CLUSTERING

Saritha Quinn 1, Vishnu Prasad 2
1 Student, 2 Student, Department of MCA, Kristu Jayanti College, Bengaluru, India

ABSTRACT
The commission of a criminal act revolves around human tendencies. To find a correlation between two similar crimes, we need a more dynamic model, one that does not pick modal attribute values at random and merely cluster behavior as normal or abnormal. Existing models may prove inefficient because humans are heuristic learners. Instead, a behavior model can be framed through factor analysis, which can help in predicting the correlation between offenders and crimes. In this paper, we gauge tendencies based on certain characteristics and prepare a model with probabilistic predictions. The models can then be used as classifiers in decision trees to predict future events. Behavioral patterns are identified from the probabilistic model; they can be used to predict a prospective crime and to identify suspects with less time and effort. A bottom-up approach is used in the process as it delivers more precise results.

Keywords: Crime association, Crime analysis, Factor analysis

I. INTRODUCTION
Data mining is the extraction of hidden models and patterns by examining large pre-existing databases. Relevant and contributing elements are selected using dynamic detection techniques [10].

Crime analytics is the analysis of data directed towards the evaluation of long-term strategies and prevention techniques. Its aims are to identify general and specific crime trends, patterns and series in an ongoing, timely manner; to maximize the use of limited law enforcement resources; to assess crime problems locally, regionally and nationally, within and between law enforcement agencies; to be proactive in detecting and preventing crimes; and to meet law enforcement needs.

Classification is a technique for predictive modeling. It is the process of dividing the data by class labels that can act dependently or independently to give relevant results. Crime analysis is usually performed in one of four ways: linkage analysis, statistical analysis, profiling and spatial analysis. The proposed method combines linkage analysis and statistical analysis [9].

Factor analysis is a statistical method used to describe variability among correlated variables in terms of factors. It holds significance for crime analysis, an investigative tool used by law enforcement agencies to identify prospective suspects and analyze patterns that may predict future offenses. Correlated variables are referred to here as factors; they represent dimensions within the data.

Agglomerative hierarchical clustering is used to realize the underlying factors. Elements of the behavior demonstrate a unique expression. This bottom-up method starts with each example in its own cluster and iteratively combines clusters to form larger and larger ones. Divisive (partitional, top-down) methods separate the data into clusters immediately. Here, a bottom-up approach is preferred because the data is obtained in the form of tiny clusters of information, which need to be labelled under a higher concept hierarchy.
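The bottom-up procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the single-linkage distance, the function names and the two-dimensional behaviour scores are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b, points):
    """Distance between two clusters: the distance of their closest members."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points):
    """Bottom-up clustering: start with one cluster per point and merge the
    closest pair until a single cluster remains. Returns the merge history."""
    clusters = [(i,) for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(pair[0], pair[1], points),
        )
        clusters = [c for c in clusters if c not in (a, b)]
        merged = tuple(sorted(a + b))
        clusters.append(merged)
        history.append(merged)
    return history


# Hypothetical behaviour scores for four crime records (assumed data).
records = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = agglomerate(records)
print(merges)
```

Each entry in the merge history is one level of the concept hierarchy: the small clusters formed first sit under the larger clusters formed later.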
The offender carries out actions that go beyond the scope of the criminal activity and look like an uncontrollable tendency. When those tendencies are observed at the crime scene, they act as the offender's calling card [12].

Crime intelligence is the analysis of people involved in crimes, particularly repeat offenders and repeat victims. Tactical crime analysis is the analysis of police data directed towards the short-term development of patrol and investigative priorities and the deployment of resources. Strategic crime analysis is directed towards the development and evaluation of long-term strategies and prevention techniques. There are three phases of a crime: (i) pre-crime analysis, (ii) identification and tracking, and (iii) post-crime analysis.

II. RELATED WORKS
Arnab Samanta [1] describes a model divided into two functions, which address the seriousness of the crime and the psychology of the criminal. The first function classifies the crime into three categories: felony, violation, and misdemeanor.

The second function involves determining the psychology of the criminal, and this function uses clustering. The result is then used to generate a binary code, and the resulting code is analyzed. Classification is used to determine the type of crime. With that, police can take the necessary actions to reduce the crime rate.

Crimes can be divided into subcategories based on different criteria. In [2], eight crime categories are given: violations, sex crimes, theft, fraud, arson, drug offenses, cybercrimes and violent crimes. Definitions are given for each category at the local and national law enforcement levels. The IPTC (International Press Telecommunications Council) [8] gives a different categorization, in which crimes are divided into war crime, corporate crime, organized crime, etc. There have been many efforts to analyze different types of crimes using automated techniques, but there is no unified framework describing how to apply those techniques to different crime types. The authors of [2] propose a framework that relates crime data mining techniques to crime type characteristics.

An intelligent crime identification system that can be used to predict possible suspects for a given crime is described in [6]. It uses five types of agents: message space agent, gateway agent, prisoner agent, criminal agent and evidence agent.

Hierarchical agglomerative clustering is a similarity-based, bottom-up clustering technique in which every term initially forms a cluster of its own. The algorithm then repeatedly merges the two most similar clusters still available until it arrives at a universal cluster that contains all the terms [8]. Another interesting approach is the incremental conceptual clustering presented in [9], which is based on category utility as a quality measure to be maximized.

III. PROPOSED SYSTEM
The segmentation process [10] consists of the following phases: morphological analysis, hierarchical agglomerative clustering and boundary detection. We describe the linguistic context of a term by the syntactic dependencies that it establishes as the head of a subject or of a PP-complement with a verb. We then represent a term by its context and count the frequency of the syntactically dominating verbs. The heads and concepts determine tendencies that help in associating crimes with suspects.
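As a minimal sketch of the term representation described above, each term can be stored as a count vector over the verbs that dominate it, and two terms can then be compared by cosine similarity. The choice of cosine similarity, the function names and the (term, verb) pairs are assumptions for illustration; the paper does not specify a similarity measure or a parser.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(pairs):
    """Map each term to a Counter of the verbs that syntactically dominate it."""
    vectors = defaultdict(Counter)
    for term, verb in pairs:
        vectors[term][verb] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical (term, dominating verb) pairs, as a dependency parser might emit.
pairs = [
    ("knife", "stab"), ("knife", "carry"), ("knife", "stab"),
    ("blade", "stab"), ("blade", "carry"),
    ("wallet", "steal"), ("wallet", "carry"),
]
vectors = context_vectors(pairs)
print(vectors["knife"])
print(cosine(vectors["knife"], vectors["blade"]))
print(cosine(vectors["knife"], vectors["wallet"]))
```

Terms that share dominating verbs score higher, so a similarity of this kind could drive the merge step of the agglomerative clustering phase.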

METHODOLOGY

Algorithm:
1. Identify the type of offense.
2. Gather historic data of similar offenses using K-Means or a similar algorithm.
3. Analyze the tendencies into factors/general behavior.
4. Profile them into their respective categories.
5. Observe whether the same pattern and predicted heuristic value are present.
6. If yes: correlation.
7. Else:
8. No correlation.
9. If a prospective crime is indicated with a high score, look at the remedial measures.
10. End

Technique
[Figure: clues C1, C2, C3, C4 and tendencies T1, T2]
Here C1, C2, ... represent clues and T1, T2, ... represent tendencies. The tendencies T1, T2, ... represent the associations between offences and offenders.
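The decision steps of the algorithm can be sketched as follows, under stated assumptions. The paper does not define how the heuristic value is computed, so this sketch substitutes a Jaccard overlap between sets of tendency labels and a hypothetical threshold of 0.5. It also replaces the K-Means step with a plain filter on offense type. The record fields and case names are invented for illustration.

```python
def tendency_score(observed, historic):
    """Jaccard overlap between two sets of tendency labels (an assumed score)."""
    if not observed and not historic:
        return 0.0
    return len(observed & historic) / len(observed | historic)


def associate(offense_type, observed, history, threshold=0.5):
    """Compare a new offense with historic offenses of the same type and
    return (case id, score) pairs whose overlap reaches the threshold."""
    matches = []
    for case in history:
        if case["type"] != offense_type:
            continue  # steps 1-2: keep only similar offenses (type filter, not K-Means)
        score = tendency_score(observed, case["tendencies"])
        if score >= threshold:  # steps 5-6: same pattern, so report a correlation
            matches.append((case["id"], score))
    # Highest scores first, so step 9 can inspect the strongest indications.
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Hypothetical historic records; T1..T4 stand for tendency labels.
history = [
    {"id": "case-A", "type": "burglary", "tendencies": {"T1", "T2", "T3"}},
    {"id": "case-B", "type": "burglary", "tendencies": {"T4"}},
    {"id": "case-C", "type": "fraud", "tendencies": {"T1", "T2"}},
]
result = associate("burglary", {"T1", "T2"}, history)
print(result)
```

An empty result corresponds to the "no correlation" branch of the algorithm. Both the threshold and the scoring function would need to be calibrated on real case data.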

IV. CONCLUSION AND FUTURE ENHANCEMENTS
However strong the proposed model is, it cannot completely replace human intervention. It can act as an investigative advisory tool in crime prevention. Since even minimal errors could be fatal, the model can only reduce the manpower required for modelling and profiling. Reinforcement learning can be applied to enhance the efficiency of the models and class labels. The precision of this model is considered reliable because it does not stick to any fixed behavior pattern but classifies behavior iteratively and then gives the probability of an event occurring. In our future efforts to implement this, we can consider more dynamic filter methodologies. The proposal focuses on how the probabilistic model improves precision and efficiency over the traditional approach.

REFERENCES
[1] Arnab Samanta, Amol, Crime Classification and Criminal Psychology Analysis using Data Mining.
[2] H. Chen, W. Chung, J. Xu, G. Wang, Y. Qin and M. Chau, Crime data mining: a general framework and some examples, Computer, vol. 37, no. 4, pp. 50-56, 2004.
[3] International Press Telecommunications Council [Online]. Available: http://www.iptc.org/site/home/
[4] Crime Mapping and Reporting System [Online]. Available: https://www.crimereports.com/
[5] Intelligent Mapping System [Online]. Available: http://maps.met.police.uk/
[6] R. Krishnamurthy and S. Kumar, Survey of data mining techniques on crime data analysis, International Journal of Data Mining Techniques and Applications, vol. 1, no. 2, pp. 117-120, December 2012.
[8] Philipp Cimiano, Andreas Hotho and Steffen Staab, Comparing Conceptual, Divisive and Agglomerative Clustering for Learning Taxonomies from Text.
[9] D. Fisher, Knowledge acquisition via incremental conceptual clustering, Machine Learning, (2), 139-172, 1987.
[10] Jerome H. Friedman, Data Mining and Statistics: What's the connection?
[11] Available: http://www.e-criminalpsychology.com/criminal-behavior-analysis/
[12] Available: http://analyticsindiamag.com/crime-analytics/