Craig Pfeifer

Detroit Metropolitan Area
2K followers 500+ connections

About

I'm a staff machine learning engineer at Lightning.ai, working with the sales team to…

Education

  • University of Maryland Baltimore County

    coursework in machine learning, neural networks
    research in domain adaptation in natural language processing

  • -

    thesis in natural language processing: Using Word Sense Generation to Generate a Semantic Lexicon

    coursework includes AI, Natural Language Processing, Knowledge Representation and Reasoning as well as old favorites Algorithms, Computer Architecture, and Operating Systems.

  • -

    Activities and Societies: Association for Computing Machinery, Science Student Council, Purdue All American Marching Band, Residence Hall Counselor, University Orchestra

    Undergraduate research in mobile computing and data structures. Double minor in Creative Writing and Technical Writing.

Licenses & Certifications

  • Certified Enterprise Architect for J2EE

    Sun Microsystems

Publications

  • Does Lawyering Matter? Predicting Judicial Decisions from Legal Briefs, and What That Means for Access to Justice

    Texas Law Review

    This study uses linguistic analysis and machine-learning techniques to predict summary judgment outcomes from the text of the briefs filed by parties in a matter. We test the predictive power of textual characteristics, stylistic features, and citation usage, and we find that citations to precedent—their frequency, their patterns, and their popularity in other briefs—are the most predictive of a summary judgment win. This finding suggests that good lawyering may boil down to good legal research. However, good legal research is expensive, and the primacy of citations in our models raises concerns about access to justice. Here, our citation-based models also suggest promising solutions. We propose a freely available, computationally enabled citation identification and brief bank tool, which would extend to all litigants the benefits of good lawyering and open up access to justice.

  • Semi-Supervised Methods for Explainable Legal Prediction

    ICAIL '19 Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law

    Legal decision-support systems have the potential to improve access to justice, administrative efficiency, and judicial consistency, but broad adoption of such systems is contingent on development of technologies with low knowledge-engineering, validation, and maintenance costs. This paper describes two approaches to an important form of legal decision support (explainable outcome prediction) that obviate both annotation of an entire decision corpus and manual processing of new cases. The first approach, which uses an Attention Network for prediction and attention weights to highlight salient case text, was shown to be capable of predicting decisions, but attention-weight-based text highlighting did not demonstrably improve human decision speed or accuracy in an evaluation with 61 human subjects. The second approach, termed SCALE (Semi-supervised Case Annotation for Legal Explanations), exploits structural and semantic regularities in case corpora to identify textual patterns that have both predictable relationships to case decisions and explanatory value.

  • ADEPT: Automated Directive Extraction from Policy Texts

    ICAIL '19 Proceedings of the Seventeenth International Conference on Artificial Intelligence and Law

    ADEPT illustrates how a document analysis task that imposes a significant burden on a wide range of agencies (directive extraction) can be addressed by deontic sentence classification in combination with nested sentence disambiguation and semantic role labeling. We anticipate that an ADEPT directive-extraction pilot will take place in mid-2019 with a representative U.S. federal agency. Automated analysis of policy documents presents a rich set of text-analytic tasks but promises very significant rewards to both agencies and citizens. ADEPT represents an initial realization of this approach to improving the administrative state through modern computational linguistics techniques.

  • Scalable Methods for Annotating Legal-Decision Corpora

    Proceedings of the Natural Legal Language Processing Workshop 2019

    Recent research has demonstrated that judicial and administrative decisions can be predicted by machine-learning models trained on prior decisions. However, to have any practical application, these predictions must be explainable, which in turn requires modeling a rich set of features. Such approaches face a roadblock if the knowledge engineering required to create these features is not scalable. We present an approach to developing a feature-rich corpus of administrative rulings about domain name disputes, an approach which leverages a small amount of manual annotation and prototypical patterns present in the case documents to automatically extend feature labels to the entire corpus. To demonstrate the feasibility of this approach, we report results from systems trained on this dataset.

  • Event classification in foreign language aviation reports

    International Journal of Knowledge Engineering and Data Mining

    When adverse aviation events occur, narrative reports describing the events and their associated flights provide a valuable record for improving safety. Manual examination of large collections of such reports is challenging. Tools for automated event classification (assignment of type labels to individual reports) can help to mitigate this challenge. While several studies have developed and systematically empirically evaluated event classification tools on English aviation narratives, we are not aware of any that have done the same on foreign language narratives. We developed and implemented an approach for event classification based on Bayesian logistic regression and a novel feature selection technique. For comparison purposes, we also implemented an approach described in the literature. We collected and annotated a corpus of Japanese aviation incident reports as well as a corpus of French incident reports. We carried out a series of experiments comparing the accuracy of our approach and the other approach.

  • Semantic edge labeling over legal citation graphs

    Proceedings of the workshop on legal text, document, and corpus analytics (LTDCA-2016)

    Citations, as in when a certain statute is being cited in another statute, differ in meaning, and we aim to annotate each edge with a semantic label that expresses this meaning or purpose. Our efforts involve defining, annotating and automatically assigning each citation edge with a specific semantic label.

  • Discriminating Non-Native English with 350 Words

    8th Workshop on Innovative Use of NLP for Building Educational Applications at NAACL 2013

    This paper describes MITRE’s participation in the native language identification (NLI) task at BEA-8. Our best effort performed at an accuracy of 82.6% in the eleven-way NLI task, placing it in a statistical tie with the best performing systems. We describe a variety of machine learning approaches that we explored, including Winnow, language modeling, logistic regression and maximum-entropy models. Our primary features were simple word and character n-grams. We also describe several ensemble methods that we employed for combining these base systems.

  • Author Attribution in US Supreme Court Decisions

    JURIX 2011, “Frontiers in Artificial Intelligence and Applications”, IOS Press

    This short paper establishes a baseline for author attribution in the domain of US Supreme Court decisions. It also examines the contribution of four different kinds of features and the size/accuracy tradeoffs that can be made.

  • Bootstrapping Multilingual Relation Discovery using English

    23rd IEEE International Conference on Tools with Artificial Intelligence

    In this paper, we describe a near-zero-cost methodology to build relation extractors for significantly distinct non-English languages using only freely available Wikipedia and other web documents, and some knowledge of English.

  • TCAR at TAC-KBP 2009

    NIST

    The TCAR team developed multiple systems in a matter of weeks to participate in both the entity linking and the slot filling tasks of the TAC-KBP evaluation.


Courses

  • Machine Learning

  • Neural Networks

Honors & Awards

  • Outstanding Alumnus of the Year

    Purdue University Computer Science Department
