Craig Pfeifer

Detroit Metropolitan Area

2K followers 500+ connections

Lightning AI

University of Maryland Baltimore County

About

I'm a staff machine learning engineer at Lightning.ai, working with the sales team to…

Activity

Llama-OCR : Llama-based Open-Source OCR Tool Llama-OCR is an open-source OCR tool based on Llama, the most popular open-source LLM. Llama-OCR is…

Liked by Craig Pfeifer
Turn any GitHub repository into LLM-ready text! Simply replace "hub" with "ingest" in a GitHub URL and receive a prompt-friendly text ingest for…

Liked by Craig Pfeifer
Hey there! We’re expanding our team at neptune.ai and looking for a Principal Solutions Architect https://lnkd.in/ejutQjka. The role is 100%…

Liked by Craig Pfeifer

Experience

Lightning AI

Detroit Metropolitan Area

Education

University of Maryland Baltimore County

2010 - 2017

coursework in machine learning, neural networks
research in domain adaptation in natural language processing
2002 - 2008

thesis in natural language processing: Using Word Sense Generation to Generate a Semantic Lexicon

coursework includes AI, Natural Language Processing, Knowledge Representation and Reasoning as well as old favorites Algorithms, Computer Architecture, and Operating Systems.
1992 - 1997

Activities and Societies: Association For Computing Machinery, Science Student Council, Purdue All American Marching Band, Residence Hall Counselor, University Orchestra

Undergraduate Research in mobile computing, data structures. Double minor in Creative Writing and Technical Writing.

Licenses & Certifications

Certified Enterprise Architect for J2EE

Sun Microsystems

Publications

Does Lawyering Matter? Predicting Judicial Decisions from Legal Briefs, and What That Means for Access to Justice

Texas Law Review May 1, 2022
This study uses linguistic analysis and machine-learning techniques to predict summary judgment outcomes from the text of the briefs filed by parties in a matter. We test the predictive power of textual characteristics, stylistic features, and citation usage, and we find that citations to precedent—their frequency, their patterns, and their popularity in other briefs—are the most predictive of a summary judgment win. This finding suggests that good lawyering may boil down to good legal research. However, good legal research is expensive, and the primacy of citations in our models raises concerns about access to justice. Here, our citation-based models also suggest promising solutions. We propose a freely available, computationally enabled citation identification and brief bank tool, which would extend to all litigants the benefits of good lawyering and open up access to justice.

Semi-Supervised Methods for Explainable Legal Prediction

ICAIL '19 Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law June 17, 2019

Legal decision-support systems have the potential to improve access to justice, administrative efficiency, and judicial consistency, but broad adoption of such systems is contingent on development of technologies with low knowledge-engineering, validation, and maintenance costs. This paper describes two approaches to an important form of legal decision support---explainable outcome prediction---that obviate both annotation of an entire decision corpus and manual processing of new cases. The first approach, which uses an Attention Network for prediction and attention weights to highlight salient case text, was shown to be capable of predicting decisions, but attention-weight-based text highlighting did not demonstrably improve human decision speed or accuracy in an evaluation with 61 human subjects. The second approach, termed SCALE (Semi-supervised Case Annotation for Legal Explanations), exploits structural and semantic regularities in case corpora to identify textual patterns that have both predictable relationships to case decisions and explanatory value.

ADEPT: Automated Directive Extraction from Policy Texts

ICAIL '19 Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law June 17, 2019

ADEPT illustrates how a document analysis task that imposes a significant burden to a wide range of agencies—directive extraction—can be addressed by deontic sentence classification in combination with nested sentence disambiguation and semantic role labeling. We anticipate that an ADEPT directive-extraction pilot will take place in mid-2019 with a representative U.S. federal agency. Automated analysis of policy documents presents a rich set of text-analytic tasks but promises very significant rewards to both agencies and citizens. ADEPT represents an initial realization of this approach to improving the administrative state through modern computational linguistics techniques.

Scalable Methods for Annotating Legal-Decision Corpora

Proceedings of the Natural Legal Language Processing Workshop 2019 June 10, 2019

Recent research has demonstrated that judicial and administrative decisions can be predicted by machine-learning models trained on prior decisions. However, to have any practical application, these predictions must be explainable, which in turn requires modeling a rich set of features. Such approaches face a roadblock if the knowledge engineering required to create these features is not scalable. We present an approach to developing a feature-rich corpus of administrative rulings about domain name disputes, an approach which leverages a small amount of manual annotation and prototypical patterns present in the case documents to automatically extend feature labels to the entire corpus. To demonstrate the feasibility of this approach, we report results from systems trained on this dataset.

Event classification in foreign language aviation reports

International Journal of Knowledge Engineering and Data Mining 2016
When adverse aviation events occur, narrative reports describing the events and their associated flights provide a valuable record for improving safety. Manual examination of large collections of such reports is challenging. Tools for automated event classification (assignment of type labels to individual reports) can help to mitigate this challenge. While several studies have developed and systematically empirically evaluated event classification tools on English aviation narratives, we are not aware of any that have done the same on foreign language narratives. We developed and implemented an approach for event classification based on Bayesian logistic regression and a novel feature selection technique. For comparison purposes, we also implemented an approach described in the literature. We collected and annotated a corpus of Japanese aviation incident reports as well as a corpus of French incident reports. We carried out a series of experiments comparing the accuracy of our approach and the other approach.

Semantic edge labeling over legal citation graphs

Proceedings of the workshop on legal text, document, and corpus analytics (LTDCA-2016) 2016
Citations, as when one statute cites another, differ in meaning, and we aim to annotate each edge with a semantic label that expresses this meaning or purpose. Our efforts involve defining, annotating, and automatically assigning a specific semantic label to each citation edge.

Discriminating Non-Native English with 350 Words

8th Workshop on Innovative Use of NLP for Building Educational Application at NAACL 2013 May 2013
This paper describes MITRE’s participation in the native language identification (NLI) task at BEA-8. Our best effort performed at an accuracy of 82.6% in the eleven-way NLI task, placing it in a statistical tie with the best performing systems. We describe a variety of machine learning approaches that we explored, including Winnow, language modeling, logistic regression and maximum-entropy models. Our primary features were simple word and character n-grams. We also describe several ensemble methods that we employed for combining these base systems.

Author Attribution in US Supreme Court Decisions

JURIX 2011, “Frontiers in Artificial Intelligence and Applications”, IOS Press December 11, 2011

This short paper establishes a baseline for author attribution in the domain of US Supreme Court decisions. It also examines the contribution of four different kinds of features and the size/accuracy tradeoffs that can be made.

Bootstrapping Multilingual Relation Discovery using English

23rd IEEE International Conference on Tools with Artificial Intelligence November 1, 2011
In this paper, we describe a near-zero-cost methodology to build relation extractors for significantly distinct non-English languages using only freely available Wikipedia and other web documents, and some knowledge of English.

TCAR at TAC-KBP 2009

NIST November 1, 2009
The TCAR team developed multiple systems in a matter of weeks to participate in both the entity linking and the slot filling tasks of the TAC-KBP evaluation.


Courses

Machine Learning

Neural Networks

Honors & Awards

Outstanding Alumnus of the Year

Purdue University Computer Science Department

Sep 2016

More activity by Craig

o3, so excited to start testing with this!

Liked by Craig Pfeifer
Everyone loves graphs. Well, most people. OK, fine, only nerds like graphs. But I’m a nerd and you probably are too. Nerd. We rely on Web Graphs…

Liked by Craig Pfeifer
“Many in the executive branch don't think this way, but they should partner with the legislative branch to measure impact. There's an opportunity to…

Liked by Craig Pfeifer
Open Source Generative AI Stack LLMs - Open-source LLMs like Llama, Qwen, Mistral, Phi and Gemma are free to use. Importantly, Llama and Qwen models…

Liked by Craig Pfeifer
Here is an inspiring article about my distinguished colleague and friend, Dr. Milt Halem. Dr. Halem not only studies forces of nature…

Liked by Craig Pfeifer
If your new year's resolution is to get up-to-speed in AI and justice, we've got you covered. Just a few spots left for in-person and virtual seats…

Liked by Craig Pfeifer
Bigger is not always better. This little 14B parameter model is outperforming larger models on complex mathematics tasks. Phi-4, released a few…

Liked by Craig Pfeifer
🔥🔥🔥

Liked by Craig Pfeifer
Our first program out of the Proactive Health Office at ARPA-H is now in the wild!

Liked by Craig Pfeifer
Out today - a great write up on our work using graph algorithms to understand Transformer reasoning: https://lnkd.in/e54V9ps4 paper:…

Liked by Craig Pfeifer
Earlier this year, the Erasmus.AI foundation, a center of excellence for Climate AI, released ClimateGPT (climategpt.ai). One of my Data Science &…

Liked by Craig Pfeifer
Join Purdue University College of Science as a Recruiting Specialist for majors in Purdue Computer Science! Inspire future #Boilermakers and create…

Liked by Craig Pfeifer
My favorite paper from NeurIPS’24 shows us that frontier LLMs don’t pay very close attention to their context windows… Needle In A Haystack: The…

Liked by Craig Pfeifer
