Jonathan Siddharth

San Francisco Bay Area

29K followers 500+ connections

Turing.com

Stanford University

About

Unleashing the world’s untapped human potential to accelerate AGI.

The bottleneck…

Activity

I’ve closed more than $500M in contact center AI deals, and here’s the truth that skeptics overlook: the fastest route to ROI in today’s call centers…

Liked by Jonathan Siddharth
Last week, I talked about how clearing out all the threads from our brain before bedtime can help us with deep sleep. Now for the paradox. After…

Liked by Jonathan Siddharth
History made that may never be overwritten. At 18 years, D Gukesh became the youngest chess champion and shattered Gary Kasparov’s record by four…

Liked by Jonathan Siddharth

Experience

Turing.com

Palo Alto, California

Menlo Park

Sunnyvale, California

Sunnyvale, CA

Santa Clara

Education

Stanford University

Awarded the Christopher Stephenson Memorial Award for Best Masters Research in the Computer Science Department at Stanford University

Research Assistant at the Stanford InfoLab

Artificial Intelligence Track

Collaborated on a research project between the Stanford Artificial Intelligence Lab (Rion Snow & Andrew Ng) and Powerset

Graduated at the top of my class (1st Rank) in the Computer Science Department at SVCE

Published my first peer-reviewed IEEE paper on Artificial Neural Networks for Self-Driving Cars as a sophomore. Presented my work at the IEEE Conference on A.I. in Singapore.

Merit Awards for 1st Rank in Computer Science (semesters 6, 8, and overall)

CAT Prize: 1st Rank in Continuous Assessment Tests (semesters 5, 6, 7, 8)

Lucas TVS Merit Award

Publications

SpotSigs: Robust and Efficient Near Duplicate Detection in Large Web Collections

ACM SIGIR May 29, 2008
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching signatures for near duplicate detection in large Web crawls. Our spot signatures are designed to favor natural-language portions of Web pages over advertisements and navigational bars. The contributions of SpotSigs are twofold: 1) by combining stopword antecedents with short chains of adjacent content terms, we create robust document signatures with a natural ability to filter out noisy components of Web pages that would otherwise distract pure n-gram-based approaches such as Shingling; 2) we provide an exact and efficient, self-tuning matching algorithm that exploits a novel combination of collection partitioning and inverted index pruning for high-dimensional similarity search. Experiments confirm an increase in combined precision and recall of more than 24 percent over state-of-the-art approaches such as Shingling or I-Match and up to a factor of 3 faster execution times than Locality Sensitive Hashing (LSH), over a demonstrative "Gold Set" of manually assessed near-duplicate news articles as well as the TREC WT10g Web collection.

SpotSigs: Near Duplicate Detection in Web Page Collections

Master's Thesis (Best Thesis Award in Computer Science at Stanford University) June 15, 2007
Motivated by our work with political scientists, we present an algorithm that detects near-duplicate Web pages. These scientists analyze Web archives of news sites. The archives were collected with crawlers and contain a large number of pages that look very different because the frame around their core content differs. However, the news stories in the pages are nearly identical. The close proximity of unrelated items on the pages makes the detection of content overlap difficult. Our SpotSigs algorithm generates signatures that are spread across each document. Places for these signatures are determined by the placement of common words, like 'is' and 'the', in the documents. We can vary our method of computing the signatures. Using hash collisions, the algorithm detects overlap among the signatures of matching contents. We study how the different SpotSigs parameters impact precision and recall performance. We propose and evaluate variants of SpotSigs on a test bed of 2,168 Web pages and study the tradeoffs involved. One of our motivations was also to keep pre-processing requirements low for the detection of near duplicates; to this end, we do not remove ads, client-side scripts, and other HTML formatting elements from the documents. On this data set, SpotSigs obtains a precision of over 93% and a recall of over 85% for near duplicate detection.
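
The idea of anchoring signatures at common words can be illustrated with a minimal Python sketch. It is not the thesis implementation: the antecedent and stopword lists, the chain length, and the function names `spot_signatures` and `jaccard` are all assumptions made for illustration, and plain set intersection stands in for the hash-collision matching step.

```python
import re

# Hypothetical word lists for illustration; not the settings used in the thesis.
ANTECEDENTS = {"is", "the", "a", "an", "was", "are"}
STOPWORDS = ANTECEDENTS | {"of", "to", "in", "and", "on", "for", "that", "it"}


def spot_signatures(text, chain_len=2):
    """Return the set of (antecedent, following content terms) signatures."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    signatures = set()
    for i, token in enumerate(tokens):
        if token not in ANTECEDENTS:
            continue
        # Collect the next `chain_len` non-stopword terms after the antecedent.
        chain = [t for t in tokens[i + 1:] if t not in STOPWORDS][:chain_len]
        if len(chain) == chain_len:
            signatures.add((token, tuple(chain)))
    return signatures


def jaccard(a, b):
    """Jaccard similarity of two signature sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


story = "The mayor is opening a new bridge across the river today."
page_a = spot_signatures("Home | Sports | Weather. " + story)
page_b = spot_signatures("Subscribe now for daily deals! " + story)
other = spot_signatures("The council is debating a budget increase for the schools.")

print(jaccard(page_a, page_b))
print(jaccard(page_a, other))
```

In this toy example the navigation bar and the advertisement contain no antecedent words, so they contribute no signatures, and the two pages carrying the same story end up with identical signature sets.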

SQUINT - SVM for Identification of Relevant Sections in Web Pages for Web Search

Machine Learning Course Project (CS229) August 1, 2006

We propose SQUINT, an SVM-based approach to identify sections (paragraphs) of a Web page that are relevant to a query in Web Search. SQUINT works by generating features from the top most relevant results returned in response to a query from a Web Search Engine, to learn more about the query and its context. It then uses an SVM with a linear kernel to score sections of a Web page based on these features. One application of SQUINT we can think of is some form of highlighting of the sections to indicate which section is most likely to be interesting to the user given the query. If the result page has a lot of (possibly diverse) content sections, this could be very useful to the user in terms of reducing the time needed to get the information. Another advantage of this scheme as compared to simple search term highlighting is that it would even score sections which do not mention the keyword at all. We also think SQUINT could be used to generate better summaries for queries in Web Search. One can also envision SQUINT as being able to create succinct summaries of pages of results, by pulling out the most relevant section in each page and creating a meta summary page of the results. The training set for SQUINT is generated by querying a Web Search Engine and hand-labelling sections. Preliminary evaluations of SQUINT by K-fold cross validation appear promising. We also analyzed the effect of feature dimensionality reduction on performance. We conclude with some insights into the problem and possible directions for future research.

Context Driven Ranking for Information Retrieval

Stanford InfoLab Independent Research under Prof. Hector Garcia-Molina & Dr. Andreas Paepcke January 1, 2006

Improving search relevance by automatically obtaining more ‘context’ (contextually related words) for the search query, weighting it appropriately, and using it to improve relevance as measured by the Discounted Cumulative Gain metric. For example, for the search query "photography", contextually related words would be "pictures", "camera", "film", etc. The presence of these contextually related words in a document is scored positively for relevance to the query.
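
As a rough illustration of the two ingredients mentioned here, the Python sketch below computes Discounted Cumulative Gain in its common form (relevance at rank i divided by log2(i + 1)) and scores a document by the presence of weighted context words. The function names, the weights, and the simple additive scoring rule are assumptions for illustration only; the project's actual weighting scheme is not described in this summary.

```python
import math


def dcg(relevances):
    """Discounted Cumulative Gain: sum of rel_i / log2(i + 1), ranks from 1."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def context_score(document, query, context_weights):
    """Score a document by query-term presence plus weighted context words.

    `context_weights` maps a context word to a hypothetical weight.
    """
    words = set(document.lower().split())
    score = 1.0 if query.lower() in words else 0.0
    score += sum(w for term, w in context_weights.items() if term in words)
    return score


# Hypothetical context weights for the query "photography".
weights = {"pictures": 0.5, "camera": 0.5, "film": 0.25}
print(context_score("a camera and film guide", "photography", weights))
print(dcg([3, 2, 3, 0, 1, 2]))
```

A ranking that places highly relevant documents earlier receives a higher DCG, which is why boosting documents that contain context words can move the metric even when they never mention the query term itself.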

Knowledge discovery in Clinical Databases with Neural Network Evidence Combination

Proceedings of 2005 International Conference on Intelligent Sensing and Information Processing, 2005. January 4, 2005

Diagnosis of diseases and disorders afflicting mankind has always been a candidate for automation. Numerous attempts made at classification of symptoms and characteristic features of disorders have rarely used neural networks due to the inherent difficulty of training with sufficient data. But, the inherent robustness of neural networks and their adaptability in varying relationships of input and output justifies their use in clinical databases. To overcome the problem of training under conditions of insufficient and incomplete data, we propose to use three different neural network classifiers, each using a different learning function. Consequent combination of their beliefs by Dempster-Shafer evidence combination overcomes weaknesses exhibited by any one classifier to a particular training set. We prove with conclusive evidence that such an approach would provide a significantly higher accuracy in the diagnosis of disorders and diseases.
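
For readers unfamiliar with Dempster-Shafer evidence combination, the following Python sketch shows Dempster's rule of combination for two mass functions. It is a generic illustration, not the paper's code: the function name `combine`, the two-hypothesis frame, and the mass values are made up, and how the paper derives masses from its neural network classifiers is not covered here.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical beliefs of two classifiers over the frame {flu, cold}.
FLU, COLD = frozenset({"flu"}), frozenset({"cold"})
EITHER = FLU | COLD
classifier_1 = {FLU: 0.6, EITHER: 0.4}
classifier_2 = {FLU: 0.5, COLD: 0.3, EITHER: 0.2}

fused = combine(classifier_1, classifier_2)
print(fused)
```

Because the rule is associative, a third classifier's masses could be folded in by calling `combine` again on the fused result.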

A Swarm Intelligence based Task Allocation Algorithm (SITA)

Senior Thesis Research Report, Best Paper Award at Abacus ’05 National level Tech Symposium at Anna University 2005

This paper proposes the use of a Swarm Intelligence based approach (SITA) for Task Allocation and scheduling in a dynamically reconfigurable environment such as the computational Grid. SITA is a massively distributed task allocation algorithm that draws inspiration from the hugely efficient foraging and food hunting paradigm of ants. We employ the ant colony optimization (ACO), a population based search technique for the solution of combinatorial optimization problems for resource discovery in the Grid. Making use of evaporating pheromone trails, the algorithm adapts effortlessly to transient network conditions like congestion, node failure, link failure etc. The use of the distributed agents (ants) working in parallel and independent of each other for resource discovery obviates the need to maintain global state across all nodes. This leads to substantial savings in memory requirements. For our analysis we considered a constraint satisfaction scenario where the objective is to optimize the often conflicting parameters of cost and time where cost is the cost of utilizing a particular Grid resource and time is the time spent in task allocation. A detailed performance analysis is also presented where we analyze the effect of various parameter settings on SITA to better understand the factors on which good allocation depends.

A System for Power-aware Agent-based Intrusion Detection (SPAID) in Wireless Ad hoc Networks

Networking and Mobile Computing. Springer Berlin Heidelberg, 2005, pp. 153-162.

In this paper, we propose a distributed hierarchical intrusion detection system for ad hoc wireless networks, based on a power level metric for potential ad hoc hosts, which is used to determine the duration for which a particular node can support a network monitoring node. We propose an iterative power-aware power-optimal solution to identifying nodes for distributed agent-based intrusion detection. The advantages that our approach entails are several, not least of which is the inherent flexibility SPAID provides. We consider minimally mobile networks in this paper, and considerations apt for mobile ad hoc networks and issues related to dynamism are earmarked for future research. Comprehensive simulations were carried out to analyze and clearly delineate the variations in performance with changing density of wireless networks, and the effect of parametric variations such as hop-radius.

Sentient Autonomous Vehicle using Advanced Neural Net Technology

IEEE International Conference on Cybernetics and Intelligent Systems - CIS 2004 December 1, 2004

SAVANT uses a multi-layer feed-forward neural network with back propagation learning to guide a mobile agent through a hostile and unfamiliar domain after being trained by a human user with domain expertise. The system learns to negotiate turns and implement lane-changing maneuvers to avoid or overtake obstacles.

A Minimal Fragmentation Algorithm for Task Allocation in Mesh-Connected Multicomputers

Proceedings of the IEEE International Conference on Advances in Intelligent Systems - Theory and Applications (AISTA 2004), 2004

Efficient allocation of processors to incoming tasks in tightly coupled systems is crucial for achieving high performance. A good allocation algorithm should identify available processors with minimum overhead. In addition, it should be submesh recognition complete and should minimize fragmentation as far as possible. In this paper, we propose an efficient task allocation mechanism called the Minimal Fragmentation Algorithm (MFA). By weighting the available nodes on the basis of their adjacency to existing busy submeshes or the mesh boundary, we identify nodes that, if chosen as the base for task allocation, would result in minimal external fragmentation. An analysis of the complexity of the proposed algorithm reveals that our scheme provides highly competitive performance.


Organizations

Plug and Play Startup Camp

Mentor

2010 - Present

Mentoring and advising seed-stage startups that have been selected as part of Startup Camp, an accelerator backed by Amidzad Ventures and the Plug and Play Tech Center. http://plugandplaytechcenter.com/startupcamp/

Stanford IEEE Student Chapter

Officer & Industry Liaison

2008 - 2010

Stanford India Association

Core Planning Group

2007

Recommendations received

2 people have recommended Jonathan

More activity by Jonathan

Great talk by Ilya Sutskever at NeurIPS. Pre-training data hit a wall since we used up all the internet tokens and synthetic data hasn’t been super…

Shared by Jonathan Siddharth
Turing is having a 💥 at NeurIPS with full house to hear from Jeff Dean at our #AGI icon 3 event. #Turing 🤝 #Google

Liked by Jonathan Siddharth
There are days that are special and unforgettable in one’s life; the day you get married, birthday of your children, the first day of your first…

Liked by Jonathan Siddharth
AGI Icons 3 is live at #neurips2024! #agiicons Turing

Liked by Jonathan Siddharth
I am grateful and delighted to be appointed the Misra Family Professor in the School of Engineering and Applied Science at the University of…

Liked by Jonathan Siddharth
Yesterday was an evening well spent with #AI leaders, practitioners and enthusiasts attending Turing 🔥 side chat panel event moderated by Anna…

Liked by Jonathan Siddharth
Thrilled to share a major milestone for Fiddler AI! We’ve raised an $18.6M Series B extension, bringing our total Series B funding to $50M. This…

Liked by Jonathan Siddharth
Big announcement 📣 - Turing’s next AGI Icons event will feature Jeff Dean 🎉 Turing’s ethos is to honor iconic engineers and their work, and they…

Shared by Jonathan Siddharth
Excited to share key takeaways from Jonathan Siddharth's (@jonsidd) insightful presentation yesterday at NeurIPS on "Improving Foundation Models with…

Liked by Jonathan Siddharth
I am delighted to announce the final guest speaker in my LLM course, Aakanksha Chowdhery, Ph.D. on Monday November 25 at 1:45 pm ET. ➡️ Register…

Liked by Jonathan Siddharth
48 hrs of air travel in 6 days for a team offsite. Inspiration and excitement overcame jet lag while transiting through SFO. ...As I walked by…

Liked by Jonathan Siddharth
Our CEO & Co-Founder, Jonathan Siddharth delivered an insightful session at the NeurIPS Conference on how expert human data can elevate foundation…

Liked by Jonathan Siddharth

Recommendations received

Anastasia (Leitner) Savvina

Lukas Biewald
