Setu Kumar Basak

Raleigh, North Carolina, United States

1K followers 500+ connections

North Carolina State University

About

I am a Ph.D. student at North Carolina State University, where I work in the Realsearch…

Articles by Setu Kumar

How to upload large files to AWS S3 in ASP.NET web service using Multipart Upload API?

Apr 6, 2019

https://medium.com/@setu677/how-to-upload-large-files-to-aws-s3-in-asp-net-web-service-using-multipart-upload-api-1e83a1…

How to set up CI/CD server using Jenkins connected to a Bitbucket repository for an ASP.NET Core Web API running in an EC2 instance?

Jul 28, 2018

How to host ASP.NET Core on Linux using Nginx?

Jul 24, 2018

How to connect to mongodb on aws ec2 instance with Robomongo?

Jul 21, 2018

Instantiating interfaces in Java !!!!

Jul 7, 2018

How to write implicit Writes for case class having more than 22 fields in Scala?

Jul 7, 2018

Java Threading

Nov 25, 2015

Recently I have been trying to learn Java threading. I have learnt some basics and uploaded them to GitHub.

Activity

Tips to solve any DSA question by understanding patterns If the input array is sorted then - Binary search - Two pointers If asked for all…

Liked by Setu Kumar Basak
I am happy to share that our paper "Leveraging Large Language Models to Detect npm Malicious Packages" has been accepted in the ICSE 2025 Research…

Liked by Setu Kumar Basak
It is my immense pleasure to share that my oral talk received 3rd place in the Graduate Student Award Competition arranged by the AIChE Forest…

Liked by Setu Kumar Basak

Experience

North Carolina State University

Raleigh, North Carolina, United States

Durham, North Carolina, United States

Raleigh, North Carolina, United States

Dhaka, Bangladesh

Gulshan, Dhaka

Dhaka

Education

North Carolina State University

2021 - 2025
2012 - 2016

Core Computer Science Courses:
Computer Basics and Programming, Object Oriented Programming, Software Development with Java, Internet Programming, Microprocessor and Assembly Languages, Software Engineering and Information Systems, Data Structure and Algorithms, Algorithm Analysis and Design, Theory of Computation, Computer Architecture, Operating Systems, Database Systems, Compiler Design, Computer Networks, Artificial Intelligence, Computer Graphics, Fault Tolerant Computing, etc.

Licenses & Certifications

Project on OWASP: Web Application Incorporating Vulnerabilities
EDUCBA
Issued Jun 2021
Credential ID 9BMHHYW6L

Web Application Security with OWASP Top 10 - Advanced
EDUCBA
Issued Jun 2021
Credential ID C-GMS09D9

Web Application Security With OWASP Top 10 - Beginners
EDUCBA
Issued Jun 2021
Credential ID V4YN6YD4-

Introduction to Software Product Management
Coursera Course Certificates
Issued Jan 2016
Credential ID 9VDYMR6RT882

Software Processes and Agile Practices
Coursera Course Certificates
Issued Jan 2016
Credential ID 4Z3M5FS9SMLJ

Java Multithreading
Udemy
Issued Nov 2015

The Data Scientist’s Toolbox
Coursera
Issued Dec 2014

Getting Started with Android
Udemy
Issued Sep 2014

Programming Mobile Applications for Android Handheld Systems
Coursera
Issued Sep 2014

Computer Science 101
Stanford University
Issued Jul 2014

Relational Algebra
Stanford University
Issued Jun 2014

Volunteer Experience

Student Motivator

National High School Programming Contest-NHSPC

May 2015 - Present (9 years 8 months)

Education

It was an event to promote contest programming at the high school level.

Publications

AssetHarvester: A Static Analysis Tool for Detecting Secret-Asset Pairs in Software Artifacts

Research track of the 47th International Conference on Software Engineering (ICSE 2025), November 1, 2024
GitGuardian monitored secrets exposure in public GitHub repositories and reported that developers leaked over 12 million secrets (database and other credentials) in 2023, indicating a 113% surge from 2021. Despite the availability of secret detection tools, developers ignore the tools' reported warnings because of false positives (25%-99%). However, each secret protects assets of different values accessible through asset identifiers (a DNS name and a public or private IP address). The asset information for a secret can aid developers in filtering false positives and prioritizing secret removal from the source code. However, existing secret detection tools do not provide the asset information, thus presenting difficulty to developers in filtering secrets only by looking at the secret value or finding the assets manually for each reported secret. The goal of our study is to aid software practitioners in prioritizing secrets removal by providing the assets information protected by the secrets through our novel static analysis tool. We present AssetHarvester, a static analysis tool to detect secret-asset pairs in a repository. Since the location of the asset can be distant from where the secret is defined, we investigated secret-asset co-location patterns and found four patterns. To identify the secret-asset pairs of the four patterns, we utilized three approaches (pattern matching, data flow analysis, and fast-approximation heuristics). We curated a benchmark of 1,791 secret-asset pairs of four database types extracted from 188 public GitHub repositories to evaluate the performance of AssetHarvester. AssetHarvester demonstrates a precision of 97%, a recall of 90%, and an F1-score of 94% in detecting secret-asset pairs. Our findings indicate that data flow analysis employed in AssetHarvester detects secret-asset pairs with 0% false positives and aids in improving the recall of secret detection tools.

A Comparative Study of Software Secrets Reporting by Secret Detection Tools

International Symposium on Empirical Software Engineering and Measurement (ESEM 2023), July 3, 2023
According to GitGuardian’s monitoring of public GitHub repositories, secrets sprawl continued accelerating in 2022 by 67% compared to 2021, exposing over 10 million secrets (API keys and other credentials). Though many open-source and proprietary secret detection tools are available, these tools output many false positives, making it difficult for developers to take action and teams to choose one tool out of many. To our knowledge, the secret detection tools are not yet compared and evaluated. The goal of our study is to aid developers in choosing a secret detection tool to reduce the exposure of secrets through an empirical investigation of existing secret detection tools. We present an evaluation of five open-source and four proprietary tools against a benchmark dataset. The top three tools based on precision are: GitHub Secret Scanner (75%), Gitleaks (46%), and Commercial X (25%), and based on recall are: Gitleaks (88%), SpectralOps (67%) and TruffleHog (52%). Our manual analysis of reported secrets reveals that false positives are due to employing generic regular expressions and ineffective entropy calculation. In contrast, false negatives are due to faulty regular expressions, skipping specific file types, and insufficient rulesets. We recommend developers choose tools based on secret types present in their projects to prevent missing secrets. In addition, we recommend tool vendors update detection rules periodically and correctly employ secret verification mechanisms by collaborating with API vendors to improve accuracy.

SecretBench: A Dataset of Software Secrets

20th International Conference on Mining Software Repositories (MSR 2023), March 14, 2023
According to GitGuardian's monitoring of public GitHub repositories, the exposure of secrets (API keys and other credentials) increased two-fold in 2021 compared to 2020, totaling more than six million secrets. However, no benchmark dataset is publicly available for researchers and tool developers to evaluate secret detection tools that produce many false positive warnings. The goal of our paper is to aid researchers and tool developers in evaluating and improving secret detection tools by curating a benchmark dataset of secrets through a systematic collection of secrets from open-source repositories. We present a labeled dataset of source codes containing 97,479 secrets (of which 15,084 are true secrets) of various secret types extracted from 818 public GitHub repositories. The dataset covers 49 programming languages and 311 file types.

What Challenges Do Developers Face About Checked-in Secrets in Software Artifacts?

International Conference on Software Engineering (ICSE) 2023, December 14, 2022
Throughout 2021, GitGuardian's monitoring of public GitHub repositories revealed a two-fold increase in the number of secrets (database credentials, API keys, and other credentials) exposed compared to 2020, accumulating more than six million secrets. To our knowledge, the challenges developers face to avoid checked-in secrets are not yet characterized. The goal of our paper is to aid researchers and tool developers in understanding and prioritizing opportunities for future research and tool automation for mitigating checked-in secrets through an empirical investigation of challenges and solutions related to checked-in secrets. We extract 779 questions related to checked-in secrets on Stack Exchange and apply qualitative analysis to determine the challenges and the solutions posed by others for each of the challenges. We identify 27 challenges and 13 solutions. The four most common challenges, in ranked order, are: (i) store/version of secrets during deployment; (ii) store/version of secrets in source code; (iii) ignore/hide of secrets in source code; and (iv) sanitize VCS history. The three most common solutions, in ranked order, are: (i) move secrets out of source code/version control and use template config file; (ii) secret management in deployment; and (iii) use local environment variables. Our findings indicate that the same solution has been mentioned to mitigate multiple challenges. However, our findings also identify an increasing trend in questions lacking accepted solutions substantiating the need for future research and tool automation on managing secrets.

What are the Practices for Secret Management in Software Artifacts?

IEEE Secure Development Conference (SecDev) 2022, October 20, 2022
Throughout 2021, GitGuardian's monitoring of public GitHub repositories revealed a two-fold increase in the number of secrets (database credentials, API keys, and other credentials) exposed compared to 2020, accumulating more than six million secrets. A systematic derivation of practices for managing secrets can help practitioners in secure development. The goal of our paper is to aid practitioners in avoiding the exposure of secrets by identifying secret management practices in software artifacts through a systematic derivation of practices disseminated in Internet artifacts. We conduct a grey literature review of Internet artifacts, such as blog articles and question and answer posts. We identify 24 practices grouped in six categories comprised of developer and organizational practices. Our findings indicate that using local environment variables and external secret management services are the most recommended practices to move secrets out of source code and to securely store secrets. We also observe that using version control system scanning tools and employing short-lived secrets are the most recommended practices to avoid accidentally committing secrets and limit secret exposure, respectively.


Projects

Denticon

Nov 2017 - Aug 2021

Planet DDS is the established leader in cloud-based dental software. The company’s Denticon practice management software is a powerful, flexible tool trusted by thousands of dental professionals across the country. Built from the ground up for enterprise groups, yet intuitive enough for solo practices.

HouseLens

Dec 2019 - Jun 2020

HouseLens is the nation's leading provider of visual marketing services for real estate, with a nationwide footprint to serve everyone from individual agents to the largest listing portals. Our product mix is constantly evolving to keep our customers at the forefront of marketing technology. Offerings include professional photography, walk-through video, interactive 3D models, VR, drones, floor plans, and more.

NOMA

May 2016 - Nov 2017

New Opportunity Model Analysis (NOMA) is a project of Foodbuy.

Foodbuy, LLC is the foodservice industry’s leading procurement services organization focused on lowering purchasing and product costs for both our parent company, Compass Group, as well as for our clients and members.

Foodbuy negotiates and contracts for more than $20bn of food, beverages, and services that our clients need, utilizing more than 600 leading manufacturers and distributors across the U.S. Ultimately, sourcing is at the heart of what we do on behalf of Compass Group and some of the most recognized organizations within the restaurant, healthcare, hospitality, leisure, and entertainment industries.

Languages

English

Professional working proficiency

Bangla

Native or bilingual proficiency

Organizations

SGIPC

Assistant Contest Manager

SGIPC (Special Group of Interest in Programming Contest) (zip'c) is a group of programmers at KUET that mainly focuses on programming contests. Besides programming contests, the group covers many aspects related to programming. It regularly arranges programming contests at KUET, as well as discussion sessions and workshops.

More activity by Setu Kumar

I am glad to share that our paper, "AssetHarvester: A Static Analysis Tool for Detecting Secret-Asset Pairs in Software Artifacts" has been accepted…

Posted by Setu Kumar Basak
🚨 Exciting New Internship Job Openings 🚨 I have handpicked 20 roles for you - don’t miss out on these amazing opportunities. 🎓 Degree Level:…

Liked by Setu Kumar Basak
Not technical enough -- that's what they said. I remember walking away from those interview rounds feeling pretty crushed. Not technical? How…

Liked by Setu Kumar Basak
https://lnkd.in/e_pjTNx2 Hiring one Postdoctoral Research Associate for my lab. Ideal candidate should have hands on experience on dynamics…

Liked by Setu Kumar Basak
