About
I'm a staff machine learning engineer at Lightning.ai, working with the sales team to…
Experience
Education
-
University of Maryland Baltimore County
-
coursework in machine learning, neural networks
research in domain adaptation in natural language processing
-
thesis in natural language processing: Using Word Sense Generation to Generate a Semantic Lexicon
coursework includes AI, Natural Language Processing, Knowledge Representation and Reasoning, as well as old favorites Algorithms, Computer Architecture, and Operating Systems.
-
Activities and Societies: Association For Computing Machinery, Science Student Council, Purdue All American Marching Band, Residence Hall Counselor, University Orchestra
Undergraduate Research in mobile computing, data structures. Double minor in Creative Writing and Technical Writing.
Licenses & Certifications
-
Certified Enterprise Architect for J2EE
Sun Microsystems
Publications
-
Does Lawyering Matter? Predicting Judicial Decisions from Legal Briefs, and What That Means for Access to Justice
Texas Law Review
This study uses linguistic analysis and machine-learning techniques to predict summary judgment outcomes from the text of the briefs filed by parties in a matter. We test the predictive power of textual characteristics, stylistic features, and citation usage, and we find that citations to precedent—their frequency, their patterns, and their popularity in other briefs—are the most predictive of a summary judgment win. This finding suggests that good lawyering may boil down to good legal research. However, good legal research is expensive, and the primacy of citations in our models raises concerns about access to justice. Here, our citation-based models also suggest promising solutions. We propose a freely available, computationally enabled citation identification and brief bank tool, which would extend to all litigants the benefits of good lawyering and open up access to justice.
-
Semi-Supervised Methods for Explainable Legal Prediction
ICAIL '19 Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law
Legal decision-support systems have the potential to improve access to justice, administrative efficiency, and judicial consistency, but broad adoption of such systems is contingent on development of technologies with low knowledge-engineering, validation, and maintenance costs. This paper describes two approaches to an important form of legal decision support---explainable outcome prediction---that obviate both annotation of an entire decision corpus and manual processing of new cases. The first approach, which uses an Attention Network for prediction and attention weights to highlight salient case text, was shown to be capable of predicting decisions, but attention-weight-based text highlighting did not demonstrably improve human decision speed or accuracy in an evaluation with 61 human subjects. The second approach, termed SCALE (Semi-supervised Case Annotation for Legal Explanations), exploits structural and semantic regularities in case corpora to identify textual patterns that have both predictable relationships to case decisions and explanatory value.
-
ADEPT: Automated Directive Extraction from Policy Texts
ICAIL '19 Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law
ADEPT illustrates how a document analysis task that imposes a significant burden to a wide range of agencies (directive extraction) can be addressed by deontic sentence classification in combination with nested sentence disambiguation and semantic role labeling. We anticipate that an ADEPT directive-extraction pilot will take place in mid-2019 with a representative U.S. federal agency. Automated analysis of policy documents presents a rich set of text-analytic tasks but promises very significant rewards to both agencies and citizens. ADEPT represents an initial realization of this approach to improving the administrative state through modern computational linguistics techniques.
-
Scalable Methods for Annotating Legal-Decision Corpora
Proceedings of the Natural Legal Language Processing Workshop 2019
Recent research has demonstrated that judicial and administrative decisions can be predicted by machine-learning models trained on prior decisions. However, to have any practical application, these predictions must be explainable, which in turn requires modeling a rich set of features. Such approaches face a roadblock if the knowledge engineering required to create these features is not scalable. We present an approach to developing a feature-rich corpus of administrative rulings about domain name disputes, an approach which leverages a small amount of manual annotation and prototypical patterns present in the case documents to automatically extend feature labels to the entire corpus. To demonstrate the feasibility of this approach, we report results from systems trained on this dataset.
-
Event classification in foreign language aviation reports
International Journal of Knowledge Engineering and Data Mining
When adverse aviation events occur, narrative reports describing the events and their associated flights provide a valuable record for improving safety. Manual examination of large collections of such reports is challenging. Tools for automated event classification (assignment of type labels to individual reports) can help to mitigate this challenge. While several studies have developed and systematically empirically evaluated event classification tools on English aviation narratives, we are not aware of any that have done the same on foreign language narratives. We developed and implemented an approach for event classification based on Bayesian logistic regression and a novel feature selection technique. For comparison purposes, we also implemented an approach described in the literature. We collected and annotated a corpus of Japanese aviation incident reports, as well as a corpus of French incident reports. We carried out a series of experiments comparing the accuracy of our approach and the other approach.
-
Semantic edge labeling over legal citation graphs
Proceedings of the workshop on legal text, document, and corpus analytics (LTDCA-2016)
Citations, such as when one statute cites another, differ in meaning, and we aim to annotate each edge with a semantic label that expresses this meaning or purpose. Our efforts involve defining and annotating a set of semantic labels and automatically assigning a specific label to each citation edge.
-
Discriminating Non-Native English with 350 Words
8th Workshop on Innovative Use of NLP for Building Educational Applications at NAACL 2013
This paper describes MITRE’s participation in the native language identification (NLI) task at BEA-8. Our best effort performed at an accuracy of 82.6% in the eleven-way NLI task, placing it in a statistical tie with the best performing systems. We describe a variety of machine learning approaches that we explored, including Winnow, language modeling, logistic regression and maximum-entropy models. Our primary features were simple word and character n-grams. We also describe several ensemble methods that we employed for combining these base systems.
-
Author Attribution in US Supreme Court Decisions
JURIX 2011, “Frontiers in Artificial Intelligence and Applications”, IOS Press
This short paper establishes a baseline for author attribution in the domain of US Supreme Court decisions. It also examines the contribution of four different kinds of features and the size/accuracy tradeoffs that can be made.
-
Bootstrapping Multilingual Relation Discovery using English
23rd IEEE International Conference on Tools with Artificial Intelligence
In this paper, we describe a near-zero-cost methodology to build relation extractors for significantly distinct non-English languages using only freely available Wikipedia and other web documents, and some knowledge of English.
-
TCAR at TAC-KBP 2009
NIST
The TCAR team developed multiple systems in just a matter of weeks to participate in the TAC-KBP evaluation under both the entity linking and the slot filling paradigms.
Courses
-
Machine Learning
-
-
Neural Networks
-
Honors & Awards
-
Outstanding Alumnus of the Year
Purdue University Computer Science Department