Professor Nicholas Petraco’s “Quantitative Criminalistics”

April 25, 2014 – Nicholas Petraco, Associate Professor at John Jay College of Criminal Justice, CUNY, presented his talk, “Quantitative Criminalistics: Application of State-of-the-Art Machine Learning and Computational Statistics to Catching Bad Guys.”

Trained in applied mathematics and quantum chemistry, Professor Petraco focuses his research on the application of rigorous science to the law and litigation. His research has been profiled in the New York Times, Popular Mechanics, C&E News, and The New Scientist.

In his presentation, Professor Petraco reviewed how data mining can be used in applied forensics. He showed how combining Bayesian and non-Bayesian techniques can offer data miners new ways to measure the performance of data mining models, and why these measurements are essential when such models are applied in law and in court. Finally, he showed examples from his own research, including the analysis of fingerprints, shoe prints, and chemicals.

If you missed the seminar, you can still catch it here:


Introduction to Directed Acyclic Graphs

March 21, 2014 – Dave Monaghan, Darren Kwong, and Dirk Witteveen, graduate candidates at the Graduate Center, CUNY, presented their seminar, “Advances in Quantitative Methods: An Introduction to Directed Acyclic Graphs.”

In their presentation, they reviewed some of the assumptions that plague the counterfactual model and showed how recent research on directed acyclic graphs (DAGs) can help with the problem of identifying causal effects between variables. Ultimately, DAGs demonstrate that researchers need to be careful about which variables they include in their analysis, and they give researchers a more rigorous approach to building analytical models.
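The point about choosing variables carefully can be illustrated with a small simulation. The sketch below is not taken from the seminar; it uses collider bias, one standard textbook case from the DAG literature, under assumed toy parameters. Two independent variables X and Y both cause a third variable C (X → C ← Y). The names `simulate_collider`, `pearson`, the noise level, and the threshold are all hypothetical choices made for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_collider(n=20000, seed=0, threshold=1.0):
    """Simulate X -> C <- Y and compare corr(X, Y) with and without selecting on C.

    All parameters are illustrative assumptions, not values from the talk.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]  # X: independent cause
    ys = [rng.gauss(0, 1) for _ in range(n)]  # Y: independent of X by construction
    cs = [x + y + rng.gauss(0, 0.5) for x, y in zip(xs, ys)]  # C: common effect

    # Conditioning on the collider: keep only cases with a high value of C.
    selected = [(x, y) for x, y, c in zip(xs, ys, cs) if c > threshold]
    sel_x = [x for x, _ in selected]
    sel_y = [y for _, y in selected]
    return pearson(xs, ys), pearson(sel_x, sel_y)


corr_all, corr_selected = simulate_collider()
print(f"corr(X, Y) in the full sample:       {corr_all:.3f}")
print(f"corr(X, Y) among cases with high C:  {corr_selected:.3f}")
```

In the full sample the correlation should be close to zero, while among the cases selected on C it should be clearly negative, even though X has no effect on Y. Selecting on a common effect is one way of conditioning on it; adding it as a control variable raises a similar concern.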

If you missed this seminar, you can catch it here:

Dr. Graham Williams’ Workshop “Excavating Knowledge from Data”

March 7, 2014 – Graham Williams, Data Scientist at the Australian Taxation Office and Adjunct Professor at the Australian National University, led a workshop titled, “Excavating Knowledge from Data: Introducing Data Mining Using R.” Dr. Williams is the author of the Rattle software for data mining and of the Rattle book, Data Mining with Rattle and R: The Art of Excavating Knowledge from Data.

During the workshop, Dr. Williams shared insights into the many uses of data mining. He gave live tutorials on how to prepare data and run models using R and, more specifically, Rattle. This workshop is a great introduction to data mining for anybody interested in learning R coding.

If you missed Dr. Williams’ workshop, you can watch the recording by clicking on the links below. Handouts from the workshop are also provided. If you find the workshop or the handouts useful to your work, we ask that you properly cite his recording or materials, which he has graciously allowed us to share on our website. If you would like more information about Dr. Williams and his work, be sure to check out his website:

Part 1 of 4 of workshop:

Part 2 of 4 of workshop:

Part 3 of 4 of workshop:

Part 4 of 4 of workshop:

Handouts: Handouts for Workshop on Rattle and R

Professor Paul Attewell’s “How much social structure is there?”

February 21, 2014 – Paul Attewell, Distinguished Professor of Sociology at the Graduate Center, CUNY, presented his talk, “How much social structure is there? The challenge presented by Data Mining.” Professor Attewell’s recent research focuses on non-elite college students and the barriers that they face and is funded by the Gates Foundation. He is also PI on the NSF-funded “CUNY Data Mining Initiative.”

In his presentation, Professor Attewell discussed how data mining methods can benefit social science research. He argued that while traditional statistical methods still leave substantial unexplained variance, data mining methods have higher accuracy and prediction rates. How much do data mining methods improve on sociological methods? Can data mining uncover hidden structure that traditional statistical techniques cannot? These are the questions that Professor Attewell raised.

If you were unable to attend the event, you can still catch the seminar here:



Professor Hanghang Tong’s “Optimal Dissemination on Graphs: Theory and Algorithms”

November 22, 2013 – Hanghang Tong, an Assistant Professor of Computer Science at the City College of the City University of New York, presented his talk, “Optimal Dissemination on Graphs: Theory and Algorithms.” Professor Tong’s research focuses on large-scale data mining for graphs and multimedia. He has received multiple awards, including a best paper award at CIKM 2012, and has published 70 refereed articles and more than 20 patents.

In the seminar, Professor Tong talked about his research on graph mining. The techniques discussed are pertinent to studies of networks, such as how rumors spread or how viral marketing expands. Professor Tong offered some insights into how these graphs can be mapped and the math behind them.

If you missed the seminar or would like to review it, you can do so here:


Alexis Gabadinho’s “Workshop on Sequence Analysis and TraMineR”

October 11, 2013 – Two months ago, Mr. Alexis Gabadinho, a scientific collaborator at the Institute for Demographic and Life Course Studies at the University of Geneva, led a workshop on sequence analysis and TraMineR. We are excited to provide a recorded video of the workshop here on our website.

At the workshop, Mr. Gabadinho introduced TraMineR, a downloadable software package for R. Mr. Gabadinho is one of the creators of this package. He demonstrated the versatility of TraMineR and introduced sequence analysis as a method of analyzing longitudinal data.

The video to this workshop is split into two parts. You can find part one of Sequence Analysis workshop here:

Part two of Sequence Analysis workshop can be watched here:

Additionally, Mr. Gabadinho has generously shared his handouts with us. As a friendly reminder, be sure to use the proper citations when referencing Mr. Gabadinho’s presentation and handouts. You can download the handouts here: Handouts for Workshop on Sequence Analysis and TraMineR

Professor Robert Stine’s Seminar: “Featurizing Text”

October 25, 2013 – Robert Stine, Professor of Statistics at the Wharton School of the University of Pennsylvania, presented his seminar, “Featurizing Text: Converting Text into Predictors for Regression Analysis.” Professor Stine’s work has appeared in numerous journals, including the Journal of the American Statistical Association, Journal of the Royal Statistical Society, and the Annals of Statistics.

In his presentation, Professor Stine introduced three ways of converting text into numerical values. He guided the audience through the methods of counting words, principal components analysis of word counts, and the forming of eigenwords from sequences of words. These new predictors can then be used to build regression models.
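For readers who want a feel for the first two ideas, here is a minimal sketch that is not taken from Professor Stine's materials. It counts words per document and then approximates the leading principal component of the centered count matrix by power iteration, using only the Python standard library. The function names, the four toy documents, and the whitespace tokenizer are all assumptions made for illustration; eigenwords are not covered.

```python
from collections import Counter


def word_count_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts vocab[j] in docs[i]."""
    counts = [Counter(doc.lower().split()) for doc in docs]  # naive tokenizer
    vocab = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocab] for c in counts]
    return vocab, matrix


def first_principal_component(matrix, iterations=200):
    """Approximate the leading principal direction of the column-centered
    matrix by power iteration; return (direction, per-document scores)."""
    n, p = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in matrix]

    def project(v):
        # Score of each document along direction v (that is, X v).
        return [sum(x * w for x, w in zip(row, v)) for row in centered]

    v = [1.0 / (j + 1) for j in range(p)]  # arbitrary non-constant start
    for _ in range(iterations):
        scores = project(v)
        # Multiply by X^T to get the next (unnormalized) direction.
        v_next = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = sum(x * x for x in v_next) ** 0.5
        if norm == 0:  # no variance left to explain
            break
        v = [x / norm for x in v_next]
    return v, project(v)


# Hypothetical toy corpus: two documents about prices, two about a team.
docs = ["the price rose", "the price fell", "the team won", "the team lost"]
vocab, matrix = word_count_matrix(docs)
direction, scores = first_principal_component(matrix)
print(vocab)
print([round(s, 3) for s in scores])
```

On this toy corpus the first component should separate the two "price" documents from the two "team" documents, so their scores take opposite signs. Scores like these, one per document, are the kind of numeric column that could then be used as a predictor in a regression model.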

Be sure to watch the seminar if you were unable to attend. You can find Professor Stine’s seminar here:

Want to learn more about Professor Stine’s work, read his draft manuscripts, or look through his presentation slides? You can find them on his website here: