Introducing Infinity - the latest text-to-image foundation model, which substantially lifts the upper limits of autoregressive models in visual generation. Infinity is built on top of our previous research VAR (NeurIPS 2024 Best Paper) and redefines visual autoregressive modeling under a bitwise token prediction framework, with an infinite-vocabulary tokenizer & classifier and a bitwise self-correction mechanism that remarkably improve generation capacity and detail. Without extra optimization, Infinity generates a high-quality 1024×1024 image in 0.8 seconds, making it the fastest text-to-image model without distillation. Infinity achieves a very high win rate against other top autoregressive models and matches or beats leading diffusion models. Watch our GitHub; we are going to open-source both models and weights in the next couple of weeks: https://lnkd.in/g9phQ6F7
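A toy illustration of the general "bitwise token prediction" idea mentioned above: instead of a softmax over an enormous codebook (2^d entries), each of the d bits of a token index gets its own binary classifier. This is a generic, hedged sketch of the concept only, not Infinity's actual architecture; the dimensions and the independent Bernoulli sampling are assumptions for illustration.

```python
import torch
import torch.nn as nn

class BitwiseHead(nn.Module):
    def __init__(self, hidden_dim: int = 512, num_bits: int = 16):
        super().__init__()
        # One logit per bit replaces a 2**num_bits-way softmax classifier.
        self.to_bits = nn.Linear(hidden_dim, num_bits)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        logits = self.to_bits(h)                       # (batch, num_bits)
        bits = torch.bernoulli(torch.sigmoid(logits))  # sample each bit independently
        return bits

head = BitwiseHead()
bits = head(torch.randn(2, 512))
# Reassemble the sampled bits into integer token ids in [0, 2**16).
token_ids = (bits.long() * (2 ** torch.arange(16))).sum(dim=-1)
print(token_ids.shape)  # torch.Size([2])
```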
-
There is a need for clarification regarding the BM42 topic. BM42 is not a concrete model but rather a new approach to generating sparse vectors. This approach unlocks flexible mechanics for tuning sparse representations for better performance. We created an example model with the simplest possible implementation to showcase it. Indeed, this approach can be used with different transformer models, different score extraction methods, and different lemmatizers. Although we haven't yet found the best configuration, we believe that BM42's core assumptions are valid and can improve retrieval quality. Our initial benchmark, published in the original article, turned out to contain major mistakes. We were too fast with publishing it and making the announcement. A proper benchmark against BM25 shows that there is much room for improvement. A huge thanks to the community for pointing this out. 🙏 In our latest experiments, we compared BM42 with BM25 using the same preprocessing pipeline, which demonstrates that BM42 can still boost performance, although not as significantly as originally anticipated. The benchmark code is open for reproduction and review: https://lnkd.in/d9fTZKAu
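A minimal, illustrative sketch of the kind of attention-based sparse weighting described above: token importances are taken from the [CLS] attention row of a transformer and paired with a BM25-style IDF term. The model name and the exact weighting scheme are assumptions for illustration, not Qdrant's reference implementation.

```python
import math
from collections import Counter

import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "sentence-transformers/all-MiniLM-L6-v2"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_attentions=True).eval()

def attention_weights(text: str) -> dict[str, float]:
    """Weight each token by the attention it receives from [CLS] (last layer, heads averaged)."""
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**enc)
    att = out.attentions[-1][0].mean(dim=0)   # (seq_len, seq_len), heads averaged
    cls_row = att[0]                          # attention from [CLS] to every token
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    weights: dict[str, float] = {}
    for tok, w in zip(tokens, cls_row.tolist()):
        if tok in tokenizer.all_special_tokens:
            continue
        weights[tok] = weights.get(tok, 0.0) + w  # merge repeated tokens
    return weights

def sparse_vector(text: str, doc_freq: Counter, n_docs: int) -> dict[str, float]:
    """Combine the attention weight with a BM25-style IDF term per token."""
    return {
        tok: w * math.log(1 + (n_docs - doc_freq[tok] + 0.5) / (doc_freq[tok] + 0.5))
        for tok, w in attention_weights(text).items()
    }
```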
-
Hi, I would like to share my side project: https://lnkd.in/g2xPNhZt 🚀 Efficient Object Detection with YOLOv11 using CUDA and TensorRT 🖥️ This high-performance object detection pipeline leverages CUDA streams, multi-threading, and TensorRT for blazing-fast inference, processing images and videos concurrently with optimized preprocessing, batch inference, and efficient NMS postprocessing.
Key Features:
- Multi-threaded parallelism: CUDA streams for concurrent processing of multiple images and video frames.
- Batch inference: optimized for processing multiple inputs simultaneously, maximizing GPU throughput.
- High-speed preprocessing: fast resizing, normalization, and data transformation.
- Non-Maximum Suppression (NMS): efficient postprocessing to ensure clean bounding box predictions (a standalone sketch of this step follows below).
- Scalable input support: handles both image and video files with a simple CLI.
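A self-contained NumPy sketch of the NMS postprocessing step listed above. It is illustrative only: the repository presumably runs an optimized CUDA/TensorRT variant, and the IoU threshold here is an assumed default.

```python
import numpy as np

def iou(box: np.ndarray, boxes: np.ndarray) -> np.ndarray:
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.45) -> list[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it above iou_thr."""
    order = scores.argsort()[::-1]
    keep: list[int] = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) < iou_thr]
    return keep
```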
-
🌳 **Day 137: Find the Kth Smallest Element in a Binary Search Tree (BST)**

**Understanding the Problem:** We are given a binary search tree (BST) and an integer K. We need to find the Kth smallest element in the BST. Since the BST is structured so that an inorder traversal yields the nodes in sorted order, we can leverage this property.

**Approach:** We use an inorder traversal to visit the nodes in sorted order, keep a counter of the number of nodes visited, and stop once we reach the Kth node.

**Implementation:**
1. **Helper Function for Inorder Traversal:**
- Define a helper function `solve` that performs an inorder traversal of the BST.
- Use a counter to track the number of nodes visited.
- If the counter matches K, set the answer to the current node's value.
2. **Main Function to Find Kth Smallest Element:**
- Initialize the answer and counter variables.
- Call the helper function to start the inorder traversal.
- Return the answer once the traversal is complete.

**Steps:**
1. Traverse the BST in inorder fashion.
2. Increment the counter each time a node is visited.
3. Once the counter matches K, capture the node's value as the answer.
4. Continue the traversal until the Kth node is found.

**Complexity Analysis:**
- Time Complexity: O(N), where N is the number of nodes in the BST. In the worst case, we might need to visit all nodes.
- Space Complexity: O(H), where H is the height of the BST, used by the recursive call stack.

This approach efficiently finds the Kth smallest element in a binary search tree. 🌟 #BinarySearchTree #KthSmallestElement #DSADay137 #AlgorithmInsights
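A minimal Python version of the approach described above: a recursive inorder traversal with a counter, stopping once the Kth node has been visited. Node class and helper names follow the description; the example tree is illustrative.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def kth_smallest(root: TreeNode, k: int) -> int:
    count = 0
    answer = -1

    def solve(node: TreeNode) -> None:
        nonlocal count, answer
        if node is None or count >= k:   # prune once the answer is found
            return
        solve(node.left)                 # visit smaller values first
        count += 1
        if count == k:
            answer = node.val
            return
        solve(node.right)

    solve(root)
    return answer

# Example: BST containing 1..5; the 3rd smallest element is 3.
root = TreeNode(3, TreeNode(1, None, TreeNode(2)), TreeNode(4, None, TreeNode(5)))
assert kth_smallest(root, 3) == 3
```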
-
Greetings Everyone !! Amidst a windy weekend afternoon, bringing to you yet another GAN variant: Pix2Pix, trained on some Golden cuteness 💕 and devilish Chihuahuas 👿 (this is not up for debate) among other breeds.

Unlike its predecessors, where the generator's goal was to produce real-looking images from a latent noise vector, this network learns a mapping from an input image to an output image. The generator, earlier a traditional CNN with slight changes, is here the famous U-Net architecture, which resembles an encoder-decoder but adds skip connections between corresponding layers that lie symmetric about the bottleneck code state. To be specific, the feature map at a given resolution during down-sampling is concatenated, along the channel dimension, with the up-sampled feature map at the same resolution. For the loss driving the gradients, the authors added an L1 term to keep the generated output from straying away from the ground truth (original image).

Coming to the discriminator, it was earlier tasked with deciding the probability of an image being real (a single sigmoid value). Here, it is modified to return raw logits, one for each N×N patch of the image. This feels more intuitive, as it is easier to classify a local structure as real/fake than to do it in a one-shot global way; this is why the adversary is named PatchGAN.

In the current experiment, I took the Stanford Dogs dataset and applied Color Jitter and Gaussian Blur using Torchvision to a subset of ~700 images, passing them as noisy inputs and evaluating the generated outputs against the originals. The network was trained for 500 epochs with a batch size of 32 and a 2e-4 learning rate, which took less than 5 hrs on an RTX 4060, and by then it had achieved quite reasonable reconstructions of dogs 🤗
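A compact PyTorch sketch of the PatchGAN idea described above: the discriminator outputs a grid of raw logits, one per local patch, rather than a single global real/fake score. The channel sizes and depth are assumptions for illustration, not the exact configuration used in this experiment.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    def __init__(self, in_channels: int = 6):  # input and target images concatenated
        super().__init__()
        def block(cin, cout, stride):
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=4, stride=stride, padding=1),
                nn.BatchNorm2d(cout),
                nn.LeakyReLU(0.2, inplace=True),
            )
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            block(64, 128, 2),
            block(128, 256, 2),
            nn.Conv2d(256, 1, kernel_size=4, stride=1, padding=1),  # raw logits, one per patch
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output shape (B, 1, H', W'): each spatial location judges one local patch.
        return self.net(x)

d = PatchDiscriminator()
logits = d(torch.randn(1, 6, 256, 256))
print(logits.shape)  # torch.Size([1, 1, 31, 31])
```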
-
🌟 New Research Announcement! 🌟 Exciting news on the convergence properties of convex message passing algorithms in graphical models. This latest blog post presents a novel technique for proving convergence to a fixed point, reaching precision $\varepsilon > 0$ within $\mathcal{O}(1/\varepsilon)$ iterations. Read the full article here: https://bit.ly/3ICai5k #SocialMediaMarketing #ResearchUpdate #ConvexAlgorithms #GraphicalModels #NewResearch #ConvergenceProperties
-
A nice free AI training on quantizing open-source multimodal and language models 😃
LLMs can take gigabytes of memory to store, which limits what can be run on consumer hardware. But quantization can dramatically compress models, making a wider selection of models available to developers. You can often reduce model size by 4x or more while maintaining reasonable performance. In our new short course Quantization Fundamentals taught by Hugging Face's Younes Belkada and Marc Sun, you'll: - Learn how to quantize nearly any open source model - Use int8 and bfloat16 (Brain float 16) data types to load and run LLMs using PyTorch and the Hugging Face Transformers library - Dive into the technical details of linear quantization to map 32-bit floats to 8-bit integers As models get bigger and bigger, quantization becomes more important for making models practical and accessible. Please check out the course here: https://lnkd.in/g66yNW8W
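A minimal sketch of the linear quantization the course covers: mapping 32-bit floats to 8-bit integers with a scale and zero-point, then dequantizing back. This uses plain PyTorch for illustration and is not the course's or the Transformers library's API.

```python
import torch

def quantize_int8(x: torch.Tensor):
    """Asymmetric per-tensor quantization of fp32 values to int8."""
    qmin, qmax = -128, 127
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = qmin - torch.round(x.min() / scale)
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax).to(torch.int8)
    return q, scale, zero_point

def dequantize(q: torch.Tensor, scale, zero_point) -> torch.Tensor:
    """Recover approximate fp32 values from the int8 representation."""
    return scale * (q.float() - zero_point)

w = torch.randn(4, 4)
q, s, z = quantize_int8(w)
print((w - dequantize(q, s, z)).abs().max())  # small reconstruction error
```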
-
Learn how to quantize machine learning models.
-
When In my thesis I make the most insignificant calculus formulations as a dichotomous stance with momentum of decay to acclimate to the way it's integrated in certain integrals in ratio format which would make the formulations more conditional pacificational variant to Withstand the basis of detriant axioms as anti judgementalism to become an organic variable that makes a lifeline, rather as pro integrational or anti integrated upon to consider how reusage of cellular respiration becomes immunity superior within the alternative equational substantial accumulational features that reduce the speed instead of thereof inhumane confined abilities to reach a threshold limitations
-
This will contribute to making the GenAI race somewhat more sustainable, with a reduced carbon footprint.