MantisNLP

IT Services and IT Consulting

Specialist consultancy in Generative AI | Natural Language Processing | AI Development, Consulting and Due Diligence

About us

Mantis NLP is an AI consultancy specialising in Generative AI and Natural Language Processing. We can advise on your data needs, integrate or embed into your AI project to provide practical support, and develop, build and deploy the most relevant machine learning and deep learning techniques to solve your problem. We are committed to reducing ethical risks in AI applications and to being active members of the open source community.

Industry
IT Services and IT Consulting
Company size
2-10 employees
Headquarters
Limassol
Type
Privately Held
Founded
2021
Specialties
Natural Language Processing, Artificial Intelligence, Machine Learning, and MLOps

Updates

  • 📊 Evaluating RAG might be simpler than you think. In fact, you might not even need evaluation data, since you can use LLMs to do some of the heavy lifting. Here are some metrics you can calculate quite easily:
    💁 Helpfulness: Is the response helpful?
    🎯 Relevance: Is the retrieved context in RAG relevant?
    🤥 Faithfulness: Is the response truthful given the context?
    LLM evaluation is quite a hot topic, and there are multiple frameworks and platforms built to let you evaluate with little to no code required. This is an interesting guide for one of them (a minimal LLM-as-judge sketch follows after the reshared post below): https://lnkd.in/ecyUPy6E

    Dipanjan S.

    Head of Community • Principal AI Scientist • Google Developer Expert & Cloud Champion Innovator • Author

    Most articles and discussions are on building RAG systems, but don't forget to evaluate these systems when building. Here's my updated comprehensive guide on the most common RAG evaluation metrics. This guide has the following:
    - Explanation of key metrics in a RAG workflow
    - Focus on retrieval evaluation metrics: Context Precision, Recall, Relevancy
    - Focus on LLM generation evaluation metrics: Answer Relevancy, Faithfulness, Hallucination Check, Custom LLM as a Judge
    - Detailed mathematical definition of each metric with explanation
    - Worked-out example for each metric
    - Hands-on code of how to use these
    Do check this out and share with others if you find it useful!
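    As a minimal illustration of the LLM-as-judge idea from the update above, here is a sketch of a faithfulness check. The judge prompt wording, the gpt-4o-mini model choice and the use of the openai client are illustrative assumptions, not taken from the linked guides.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical judge prompt; real frameworks use more robust rubrics.
JUDGE_PROMPT = """You are grading a RAG system.

Context:
{context}

Answer:
{answer}

Is every claim in the answer supported by the context?
Reply with a single word: yes or no."""

def faithfulness(context: str, answer: str) -> bool:
    """Return True if the judge model deems the answer faithful to the context."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative judge model
        messages=[{"role": "user",
                   "content": JUDGE_PROMPT.format(context=context, answer=answer)}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")

# Score a handful of (context, answer) pairs and report the ratio.
pairs = [
    ("Phi-4 is a 14B parameter model released by Microsoft.",
     "Phi-4 has 14 billion parameters."),
    ("Phi-4 is a 14B parameter model released by Microsoft.",
     "Phi-4 was released by Google."),
]
scores = [faithfulness(c, a) for c, a in pairs]
print(f"Faithfulness: {sum(scores)}/{len(scores)}")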

  • Bites from last week's AI news 🍪
    1/ Microsoft releases Phi-4, a 14B model with performance similar to some frontier models 🚀 https://lnkd.in/eZnXjYT9
    2/ Google started rolling out its Gemini 2.0 models, beginning with Flash, which seems to be better than 1.5 Pro ⚡ The 2.0 series is advertised as optimised for agentic workflows, some examples of which were shown through Project Astra and Mariner. https://lnkd.in/ghJZR66c
    3/ Agentic workflows are finding their way into mainstream applications. Agentforce in Salesforce is one good example: it enables their customers to build customer support and sales agents that handle some of the actions that can be automated 🤖 https://lnkd.in/gMiH_FHB
    4/ Meta introduced a new transformer architecture that dynamically creates byte patches instead of tokens, based on how probable the next byte is 🧪 This representation seems to scale better since it does not rely on a fixed vocabulary of tokens. https://lnkd.in/d4tf2puZ
    5/ Meta released Llama 3.3 70B, which delivers the same performance as their flagship 405B from version 3.1 at a much lower cost 🔥 It matches GPT-4o, Claude 3.5, Gemini 1.5 Pro and Nova Pro in a couple of benchmarks. https://lnkd.in/ghyZu6UZ

  • ⚡ AI inference is speeding up
    Transformers, the underlying technology behind today's AI, are notoriously slow at inference. Remember when ChatGPT launched and you had to wait a couple of seconds while your answer was generated one word at a time? Back then, generation speed was approximately 1-10 tokens per second, with the possibility of a 10x increase after quantisation and other optimisations 🐌
    Noticeably, AI responses are almost instant nowadays. At the same time, there is a race happening at the hardware level over which provider can run AI the fastest, with two of the most prominent players being Cerebras and Groq 🔥 A few months ago Groq broke onto the scene with an advertised generation speed of 100 tokens per second for Llama 70B, which has now increased to 250 👌 And while this is extremely fast for such a large model, Cerebras recently announced a speed of 2,200 tokens per second for the same model 😮
    On the surface such speeds may seem irrelevant for your application, but this is not entirely true, since AI applications nowadays consist of multiple AI calls and components. Those speeds can enable building more complex solutions that still feel instant to the user. They also make it possible to improve existing solutions by letting the model "think more" in the same amount of time 🚀 A back-of-the-envelope latency calculation is sketched below.
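    To make the "multiple AI calls" point concrete, here is a rough latency sketch. The pipeline shape (three chained calls of 200 generated tokens each) is an illustrative assumption; only the 250 t/s and 2,200 t/s figures come from the post above.

# Rough end-to-end latency of sequential LLM calls, ignoring network
# overhead and time-to-first-token.
def pipeline_latency(calls: int, tokens_per_call: int, tokens_per_second: float) -> float:
    """Seconds to generate `calls` * `tokens_per_call` tokens at a given speed."""
    return calls * tokens_per_call / tokens_per_second

for name, speed in [("~10 t/s (early ChatGPT era)", 10),
                    ("250 t/s (Groq, Llama 70B)", 250),
                    ("2200 t/s (Cerebras, Llama 70B)", 2200)]:
    print(f"{name}: {pipeline_latency(3, 200, speed):.1f}s for a 3-call pipeline")

# ~10 t/s (early ChatGPT era): 60.0s for a 3-call pipeline
# 250 t/s (Groq, Llama 70B): 2.4s for a 3-call pipeline
# 2200 t/s (Cerebras, Llama 70B): 0.3s for a 3-call pipeline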

  • Bites from last week's AI news 🍪
    1/ Amazon releases Nova, a series of multi-modal models that push the boundary of cost-efficient intelligence 💸 https://lnkd.in/eqJ57Yt2
    2/ OpenAI's o1 exits preview, and the final model seems quite close to the preview version 😐 https://lnkd.in/gnyaTZyB
    3/ Pydantic releases its agent framework PydanticAI, setting out to formalise the inputs, outputs and tools used in agentic workflows 👌 https://lnkd.in/ezZ2z-W9
    4/ Researchers around the world collaborated in training a 10B model in a distributed fashion 😮 This opens the door for frontier models to be trained and released completely in the open, given enough contributors joining forces. https://lnkd.in/gUqnbTEe

  • 🤔 To quantise or not?
    As the race to ever higher compression of LLM parameters continues, it is worth asking: what impact does this compression have on performance, and which models are affected more than others?
    There is no free lunch here: compressing the parameters essentially reduces the ability of the model to learn, in essence its effective parameter count. The effect is more evident in models that have reached close to peak performance for their size, i.e. small models trained on many tokens. Those models effectively use the full representational power they are given, be it 16-bit or whatever precision they were trained in, so quantising them makes them lose important information.
    Importantly, the performance penalty occurs only when the precision used for inference is smaller than the one the model was trained in, e.g. trained in 16-bit and quantised to 4-bit. So to the extent the same precision is used for both training and inference, there is no problem with lower precision. Of course you can expect a higher-precision model to perform better, all else being equal.
    So here are the two takeaways to remember regarding quantisation:
    1️⃣ Do not quantise small models trained on many tokens, i.e. anything state of the art below 20B. Do quantise larger models, anything above 20B (a minimal 4-bit loading sketch follows below).
    2️⃣ If you end up using a small model trained on many tokens in low precision, it's best to find one that has been trained in low precision.
    🔗 Read more in the scaling law for precision paper https://lnkd.in/e7r7GEXR
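    For reference, here is one common way to apply 4-bit quantisation in practice, using Hugging Face transformers with bitsandbytes. The model name is an illustrative assumption; the post does not prescribe a specific stack.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantisation config; per the takeaway above, this suits
# larger models (roughly >20B) better than small, token-saturated ones.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_name = "meta-llama/Llama-3.3-70B-Instruct"  # illustrative choice of a >20B model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,  # weights are quantised on load
    device_map="auto",
)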

  • Bites from last week's AI news 🍪
    1/ Mistral announced Pixtral Large, its multimodal offering that seems on par with state-of-the-art models 🔥 https://lnkd.in/g5fDyH8s
    2/ Cerebras clocks 1,000 tokens per second for the largest Llama 😮 For comparison, Groq, another leading AI inference solution, advertises a speed of 736 t/s 🐌 for only the small Llama 🦙 https://lnkd.in/g-RGjf9Q
    3/ Anthropic now offers the ability to infer your writing style from previous writings and adjust Claude to write like you ✍️ https://lnkd.in/eRZFD4Q6

  • 📚 Fast and accurate tool to parse technical PDF documents
    Parsing documents written for humans - such as scientific papers, policy documents and patents - is a well-established use case of AI, aiming to make the information inside those documents structured and usable. Up until now you could use either a specialised model that worked only in some cases, or an LLM that was more general but often failed depending on the document format.
    It seems we may have the best of both worlds with Docling 🦆: a new tool based on a layout- and table-aware architecture, but scaled to a large enough dataset to be both more accurate and fast 🔥 It is also open source and easy to use with a few lines of code (see the sketch below). Definitely worth trying as a component of your RAG system or information extraction pipeline.
    🔗 Read more in the technical report https://lnkd.in/egQZszDi
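    As a taste of the "few lines of code", here is a minimal usage sketch based on Docling's documented DocumentConverter API; the input path is a placeholder.

from docling.document_converter import DocumentConverter

converter = DocumentConverter()
# Convert a technical PDF (path is a placeholder) into a structured document.
result = converter.convert("paper.pdf")

# Export the parsed content, tables included, as Markdown for a RAG pipeline.
print(result.document.export_to_markdown())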

