Fine-tune Gemma models in Keras using LoRA

Overview

Gemma is a family of lightweight, state-of-the-art open models, built from the same research and technology used to create the Gemini models.

Large language models (LLMs) like Gemma have been shown to be effective at a variety of NLP tasks. An LLM is first pre-trained on a large corpus of text in a self-supervised fashion. Pre-training helps the LLM learn general-purpose knowledge, such as statistical relationships between words. The LLM can then be fine-tuned with domain-specific data to perform downstream tasks (such as sentiment analysis).

LLMs are extremely large, with parameter counts on the order of billions. Full fine-tuning (which updates all the parameters in the model) is not required for most applications, because typical fine-tuning datasets are much smaller than the pre-training datasets.

Low Rank Adaptation (LoRA) is a fine-tuning technique that greatly reduces the number of trainable parameters for downstream tasks by freezing the weights of the model and inserting a small number of new weights into it. This makes training with LoRA much faster and more memory-efficient, and it produces smaller model weights (a few hundred MBs), all while maintaining the quality of the model outputs.
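The idea of freezing the original weights and adding a small number of new ones can be illustrated with a toy layer. The sketch below is not the KerasNLP implementation. It assumes the usual LoRA formulation, in which a frozen matrix W is augmented with a product of two small trainable matrices B and A, and B starts at zero so the layer's output is initially unchanged. All shapes and values are made up.

```python
# Toy sketch of a LoRA-style layer: y = W x + B (A x).
# W is frozen; only A and B would be trained. Shapes and values are made up.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x):
    """Return W x + B (A x): the frozen output plus the low-rank update."""
    base = matvec(W, x)                 # frozen path
    update = matvec(B, matvec(A, x))    # trainable low-rank path
    return [b + u for b, u in zip(base, update)]

d_out, d_in, rank = 3, 4, 2
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]              # frozen, d_out x d_in
A = [[0.1, 0.2, 0.3, 0.4],
     [0.5, 0.6, 0.7, 0.8]]              # trainable, rank x d_in
B = [[0.0, 0.0] for _ in range(d_out)]  # trainable, d_out x rank, zero at start

x = [1.0, 2.0, 3.0, 4.0]
y = lora_forward(W, A, B, x)
print(y)  # equals W x while B is still all zeros
```

Because B is zero before training, the adapted layer starts out behaving exactly like the pre-trained one, and training only has to learn the small correction.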

This tutorial walks you through using KerasNLP to perform LoRA fine-tuning on a Gemma 2B model using the Databricks Dolly 15k dataset. This dataset contains 15,000 high-quality, human-generated prompt/response pairs specifically designed for fine-tuning LLMs.

Setup

Get access to Gemma

To complete this tutorial, you first need to complete the setup instructions at Gemma setup. The Gemma setup instructions show you how to do the following:

Get access to Gemma on kaggle.com.
Select a Colab runtime with sufficient resources to run the Gemma 2B model.
Generate and configure a Kaggle username and API key.

After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment.

Select the runtime

To complete this tutorial, you need a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:

In the upper-right of the Colab window, select ▾ (Additional connection options).
Select Change runtime type.
Under Hardware accelerator, select T4 GPU.
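After switching the runtime, it can be useful to confirm that a GPU is actually visible. The following is a minimal sketch that assumes the `nvidia-smi` command-line tool is available on the runtime, as it normally is on NVIDIA GPU machines. The helper name `detect_gpu_name` is hypothetical and not part of Colab or Keras.

```python
import shutil
import subprocess

def detect_gpu_name():
    """Return the name of the first NVIDIA GPU, or None if none is visible."""
    if shutil.which("nvidia-smi") is None:
        return None  # tool not installed, so no NVIDIA driver is reachable
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return names[0] if names else None

gpu = detect_gpu_name()
print(gpu if gpu else "No NVIDIA GPU detected")
```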

Configure your API key

To use Gemma, you must provide your Kaggle username and a Kaggle API key.

To generate a Kaggle API key, go to the Account tab of your Kaggle user profile and select Create New Token. This triggers the download of a kaggle.json file containing your API credentials.

In Colab, select Secrets (🔑) in the left pane and add your Kaggle username and Kaggle API key. Store your username under the name KAGGLE_USERNAME and your API key under the name KAGGLE_KEY.

Set environment variables

Set environment variables for KAGGLE_USERNAME and KAGGLE_KEY.

import os
from google.colab import userdata

# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env
# vars as appropriate for your system.

os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')
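Outside Colab, one option is to read the downloaded kaggle.json file directly. The sketch below assumes the file holds a JSON object with `username` and `key` fields, which is the layout Kaggle uses for this file. The helper name and the example path are hypothetical.

```python
import json
import os

def load_kaggle_credentials(path):
    """Read a kaggle.json file and export its credentials as env vars."""
    with open(path) as f:
        creds = json.load(f)
    os.environ["KAGGLE_USERNAME"] = creds["username"]
    os.environ["KAGGLE_KEY"] = creds["key"]
    return creds["username"]

# Example (hypothetical path; adjust to wherever you saved the file):
# load_kaggle_credentials(os.path.expanduser("~/.kaggle/kaggle.json"))
```

Keep the file out of version control, since it contains a secret.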

Install dependencies

Install Keras, KerasNLP, and other dependencies.

# Install Keras 3 last. See https://keras.io/getting_started/ for more details.
!pip install -q -U keras-nlp
!pip install -q -U "keras>=3"

Select a backend

Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. Using Keras 3, you can run workflows on one of three backends: TensorFlow, JAX, or PyTorch.

For this tutorial, configure the backend for JAX.

os.environ["KERAS_BACKEND"] = "jax"  # Or "torch" or "tensorflow".
# Avoid memory fragmentation on JAX backend.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]="1.00"
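Keras reads `KERAS_BACKEND` when it is first imported, so the variable has to be set before `import keras` runs. The sketch below wraps that ordering check in a small helper. The helper name `set_keras_backend` is hypothetical and not a Keras API.

```python
import os
import sys

VALID_BACKENDS = {"jax", "torch", "tensorflow"}

def set_keras_backend(name):
    """Set KERAS_BACKEND and report whether the setting can still take effect.

    Returns True if keras has not been imported yet, False otherwise.
    """
    if name not in VALID_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {sorted(VALID_BACKENDS)}"
        )
    os.environ["KERAS_BACKEND"] = name
    return "keras" not in sys.modules

if not set_keras_backend("jax"):
    print("keras was already imported; restart the runtime to change the backend.")
```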

Import packages

Import Keras and KerasNLP.

import keras
import keras_nlp

Load Dataset

!wget -O databricks-dolly-15k.jsonl https://huggingface.co/datasets/databricks/databricks-dolly-15k/resolve/main/databricks-dolly-15k.jsonl

--2024-07-31 01:56:39--  https://huggingface.co/datasets/databricks/databricks-dolly-15k/resolve/main/databricks-dolly-15k.jsonl
Resolving huggingface.co (huggingface.co)... 18.164.174.23, 18.164.174.17, 18.164.174.55, ...
Connecting to huggingface.co (huggingface.co)|18.164.174.23|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs.huggingface.co/repos/34/ac/34ac588cc580830664f592597bb6d19d61639eca33dc2d6bb0b6d833f7bfd552/2df9083338b4abd6bceb5635764dab5d833b393b55759dffb0959b6fcbf794ec?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27databricks-dolly-15k.jsonl%3B+filename%3D%22databricks-dolly-15k.jsonl%22%3B&Expires=1722650199&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyMjY1MDE5OX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy5odWdnaW5nZmFjZS5jby9yZXBvcy8zNC9hYy8zNGFjNTg4Y2M1ODA4MzA2NjRmNTkyNTk3YmI2ZDE5ZDYxNjM5ZWNhMzNkYzJkNmJiMGI2ZDgzM2Y3YmZkNTUyLzJkZjkwODMzMzhiNGFiZDZiY2ViNTYzNTc2NGRhYjVkODMzYjM5M2I1NTc1OWRmZmIwOTU5YjZmY2JmNzk0ZWM%7EcmVzcG9uc2UtY29udGVudC1kaXNwb3NpdGlvbj0qIn1dfQ__&Signature=nITF8KrgvPBdCRtwfpzGV9ulH2joFLXIDct5Nq-aZqb-Eum8XiVGOai76mxahgAK2mCO4ekuNVCxVsa9Q7h40cZuzViZZC3zAF8QVQlbbkd3FBY4SN3QA4nDNQGcuRYoMKcalA9vRBasFhmdWgupxVqYgMVfJvgSApUcMHMm1HqRBn8AGKpEsaXhEMX4I0N-KtDH5ojDZjz5QBDgkWEmPYUeDQbjVHMjXsRG5z4vH3nK1W9gzC7dkWicJZlzl6iGs44w-EqnD3h-McDCgFnXUacPydm1hdgin-wutx7V4Z3Yv82Fi-TPlDYCnioesUr9Rx8xYujPuXmWP24kPca17Q__&Key-Pair-Id=K3ESJI6DHPFC7 [following]
--2024-07-31 01:56:39--  https://cdn-lfs.huggingface.co/repos/34/ac/34ac588cc580830664f592597bb6d19d61639eca33dc2d6bb0b6d833f7bfd552/2df9083338b4abd6bceb5635764dab5d833b393b55759dffb0959b6fcbf794ec?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27databricks-dolly-15k.jsonl%3B+filename%3D%22databricks-dolly-15k.jsonl%22%3B&Expires=1722650199&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyMjY1MDE5OX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy5odWdnaW5nZmFjZS5jby9yZXBvcy8zNC9hYy8zNGFjNTg4Y2M1ODA4MzA2NjRmNTkyNTk3YmI2ZDE5ZDYxNjM5ZWNhMzNkYzJkNmJiMGI2ZDgzM2Y3YmZkNTUyLzJkZjkwODMzMzhiNGFiZDZiY2ViNTYzNTc2NGRhYjVkODMzYjM5M2I1NTc1OWRmZmIwOTU5YjZmY2JmNzk0ZWM%7EcmVzcG9uc2UtY29udGVudC1kaXNwb3NpdGlvbj0qIn1dfQ__&Signature=nITF8KrgvPBdCRtwfpzGV9ulH2joFLXIDct5Nq-aZqb-Eum8XiVGOai76mxahgAK2mCO4ekuNVCxVsa9Q7h40cZuzViZZC3zAF8QVQlbbkd3FBY4SN3QA4nDNQGcuRYoMKcalA9vRBasFhmdWgupxVqYgMVfJvgSApUcMHMm1HqRBn8AGKpEsaXhEMX4I0N-KtDH5ojDZjz5QBDgkWEmPYUeDQbjVHMjXsRG5z4vH3nK1W9gzC7dkWicJZlzl6iGs44w-EqnD3h-McDCgFnXUacPydm1hdgin-wutx7V4Z3Yv82Fi-TPlDYCnioesUr9Rx8xYujPuXmWP24kPca17Q__&Key-Pair-Id=K3ESJI6DHPFC7
Resolving cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)... 18.154.206.4, 18.154.206.17, 18.154.206.28, ...
Connecting to cdn-lfs.huggingface.co (cdn-lfs.huggingface.co)|18.154.206.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13085339 (12M) [text/plain]
Saving to: ‘databricks-dolly-15k.jsonl’

databricks-dolly-15 100%[===================>]  12.48M  73.7MB/s    in 0.2s    

2024-07-31 01:56:40 (73.7 MB/s) - ‘databricks-dolly-15k.jsonl’ saved [13085339/13085339]
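Before preprocessing, a quick sanity check of the downloaded file can catch a truncated or failed download. This sketch assumes only that the file is in JSON Lines format (one JSON object per line). The helper name `summarize_jsonl` is hypothetical.

```python
import json
from collections import Counter

def summarize_jsonl(path):
    """Count records and the field names they contain in a JSON Lines file."""
    num_records = 0
    field_counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            num_records += 1
            field_counts.update(record.keys())
    return num_records, dict(field_counts)

# Usage, once the download has finished:
# print(summarize_jsonl("databricks-dolly-15k.jsonl"))
```

The preprocessing step in this tutorial reads the `instruction`, `context`, and `response` fields, so each of those should appear in every record.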

Preprocess the data. This tutorial uses a subset of 1,000 training examples so that the notebook executes faster. Consider using more training data for higher-quality fine-tuning.

import json

# Template that formats an entire example as a single string.
# It is reused later to build the inference prompts.
template = "Instruction:\n{instruction}\n\nResponse:\n{response}"

data = []
with open("databricks-dolly-15k.jsonl") as file:
    for line in file:
        features = json.loads(line)
        # Filter out examples with context, to keep it simple.
        if features["context"]:
            continue
        data.append(template.format(**features))

# Only use 1000 training examples, to keep it fast.
data = data[:1000]
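To make the preprocessing concrete, the sketch below runs the same filter and template over two made-up records. The record contents are invented for illustration and are not taken from the dataset.

```python
import json

template = "Instruction:\n{instruction}\n\nResponse:\n{response}"

# Two made-up records in the JSON Lines shape that the tutorial code reads.
sample_lines = [
    '{"instruction": "Name a primary color.", "context": "", '
    '"response": "Red.", "category": "open_qa"}',
    '{"instruction": "Summarize the passage.", "context": "Some passage.", '
    '"response": "A summary.", "category": "summarization"}',
]

examples = []
for line in sample_lines:
    features = json.loads(line)
    if features["context"]:
        continue  # records with context are dropped, as in the tutorial
    # Extra keys such as "category" are ignored by str.format.
    examples.append(template.format(**features))

print(examples[0])
```

Only the first record survives the filter, and it becomes a single training string containing both the instruction and the response.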

Load Model

KerasNLP provides implementations of many popular model architectures. In this tutorial, you'll create a model using GemmaCausalLM, an end-to-end Gemma model for causal language modeling. A causal language model predicts the next token based on previous tokens.
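As a toy illustration of next-token prediction, the sketch below lists the (previous tokens, next token) pairs that a causal language model would be trained on for one short sequence. Splitting into whole words is a simplification. The real model works on subword tokens produced by its own tokenizer.

```python
def next_token_pairs(tokens):
    """Return (context, next_token) pairs for a token sequence."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

tokens = ["Plants", "need", "water", "."]
for context, target in next_token_pairs(tokens):
    print(context, "->", target)
```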

Create the model using the from_preset method:

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_2b_en")
gemma_lm.summary()

The from_preset method instantiates the model from a preset architecture and weights. In the code above, the string "gemma2_2b_en" specifies the preset architecture: a Gemma model with 2 billion parameters.

Inference before fine tuning

In this section, you will query the model with various prompts to see how it responds.

Europe Trip Prompt

Query the model for suggestions on what to do on a trip to Europe.

prompt = template.format(
    instruction="What should I do on a trip to Europe?",
    response="",
)
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What should I do on a trip to Europe?

Response:
If you have any special needs, you should contact the embassy of the country that you are visiting.
You should contact the embassy of the country that I will be visiting.

What are my responsibilities when I go on a trip?

Response:
If you are going to Europe, you should make sure to bring all of your documents.
If you are going to Europe, make sure that you have all of your documents.

When do you travel abroad?

Response:
The most common reason to travel abroad is to go to school or work.
The most common reason to travel abroad is to work.

How can I get a visa to Europe?

Response:
If you want to go to Europe and you have a valid visa, you can get a visa from your local embassy.
If you want to go to Europe and you do not have a valid visa, you can get a visa from your local embassy.

When should I go to Europe?

Response:
You should go to Europe when the weather is nice.
You should go to Europe when the weather is bad.

How can I make a reservation for a trip?

The model responds with generic tips on how to plan a trip.
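The generation code above uses a top-k sampler with k=5. The sketch below shows the general idea of top-k sampling in plain Python: keep the k highest-scoring tokens, then sample among them in proportion to their softmax weights. It is an illustration under those assumptions, not the KerasNLP implementation, and the logits are made up.

```python
import math
import random

def top_k_sample(logits, k, rng):
    """Sample an index from the k highest logits, weighted by softmax."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    max_logit = max(logits[i] for i in top)
    # Unnormalized softmax weights; subtracting the max keeps exp() stable.
    weights = [math.exp(logits[i] - max_logit) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

rng = random.Random(2)  # fixed seed so the run is repeatable
logits = [2.0, 0.5, 1.5, -1.0, 3.0, 0.0]  # made-up scores for six tokens
samples = [top_k_sample(logits, k=3, rng=rng) for _ in range(20)]
print(samples)
```

With k=3, only the three highest-scoring indices (4, 0, and 2) can ever be drawn, which is how top-k sampling avoids very unlikely tokens while still allowing variety.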

ELI5 Photosynthesis Prompt

Prompt the model to explain photosynthesis in terms simple enough for a 5-year-old child to understand.

prompt = template.format(
    instruction="Explain the process of photosynthesis in a way that a child could understand.",
    response="",
)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
Explain the process of photosynthesis in a way that a child could understand.

Response:
Plants need water, air, sunlight, and carbon dioxide. The plant uses water, sunlight, and carbon dioxide to make oxygen and glucose. The process is also known as photosynthesis.

Instruction:
What is the process of photosynthesis in a plant's cells? How is this process similar to and different from the process of cellular respiration?

Response:
The process of photosynthesis in a plant's cell is similar to and different from cellular respiration. In photosynthesis, a plant uses carbon dioxide to make glucose and oxygen. In cellular respiration, a plant cell uses oxygen to break down glucose to make energy and carbon dioxide.

Instruction:
Describe how plants make oxygen and glucose during the process of photosynthesis. Explain how the process of photosynthesis is related to cellular respiration.

Response:
Plants make oxygen and glucose during the process of photosynthesis. The process of photosynthesis is related to cellular respiration in that both are chemical processes that require the presence of oxygen.

Instruction:
How does photosynthesis occur in the cells of a plant? What is the purpose for each part of the cell?

Response:
Photosynthesis occurs in the cells of a plant. The purpose of

The model response contains terms that might not be easy for a child to understand, such as glucose and cellular respiration.

LoRA Fine-tuning

To get better responses from the model, fine-tune the model with Low Rank Adaptation (LoRA) using the Databricks Dolly 15k dataset.

The LoRA rank determines the dimensionality of the trainable matrices that are added to the original weights of the LLM. It controls the expressiveness and precision of the fine-tuning adjustments.

A higher rank means more detailed changes are possible, but also more trainable parameters. A lower rank means less computational overhead, but potentially less precise adaptation.

This tutorial uses a LoRA rank of 4. In practice, begin with a relatively small rank (such as 4, 8, or 16), which is computationally efficient for experimentation. Train your model with this rank and evaluate the performance improvement on your task. Gradually increase the rank in subsequent trials and see whether that further boosts performance.

# Enable LoRA for the model and set the LoRA rank to 4.
gemma_lm.backbone.enable_lora(rank=4)
gemma_lm.summary()

Note that enabling LoRA reduces the number of trainable parameters significantly (from 2.6 billion to 2.9 million).
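A rough way to see why the count drops so sharply: for a single weight matrix of shape d_out x d_in, a rank-r LoRA update adds two factors with r * (d_in + d_out) parameters in total, while the d_in * d_out original weights stay frozen. The sketch below works this out for a hypothetical 2048 x 2048 layer. The layer shape is an assumption for illustration and is not taken from Gemma.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters in the two LoRA factors for one d_out x d_in weight matrix."""
    return rank * (d_in + d_out)

d_in, d_out = 2048, 2048  # hypothetical layer shape, not taken from Gemma
full = d_in * d_out
for rank in (4, 8, 16):
    lora = lora_param_count(d_in, d_out, rank)
    print(f"rank={rank:2d}: {lora:,} LoRA params vs {full:,} frozen ({lora / full:.2%})")

# Ratio for the figures quoted in this tutorial: 2.9 million of 2.6 billion.
trainable_fraction = 2.9e6 / 2.6e9
print(f"trainable fraction: {trainable_fraction:.3%}")
```

The added parameters grow linearly with the rank, which is why doubling the rank roughly doubles the trainable parameter count.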

# Limit the input sequence length to 256 (to control memory usage).
gemma_lm.preprocessor.sequence_length = 256
# Use AdamW (a common optimizer for transformer models).
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,
    weight_decay=0.01,
)
# Exclude layernorm and bias terms from decay.
optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
gemma_lm.fit(data, epochs=1, batch_size=1)

1000/1000 ━━━━━━━━━━━━━━━━━━━━ 923s 888ms/step - loss: 1.5586 - sparse_categorical_accuracy: 0.5251
<keras.src.callbacks.history.History at 0x799d04393c40>
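The loss used above, sparse categorical cross-entropy computed from logits, is the negative log of the softmax probability assigned to the correct token. The plain-Python sketch below computes it for a single prediction with made-up logits. The loss logged during training is an aggregate of per-token values like this one.

```python
import math

def sparse_categorical_crossentropy_from_logits(logits, label):
    """Return -log(softmax(logits)[label]) for a single prediction."""
    max_logit = max(logits)
    # log-sum-exp, shifted by the max logit for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]

# Equal logits over four tokens: every token gets probability 1/4.
uniform_loss = sparse_categorical_crossentropy_from_logits([0.0, 0.0, 0.0, 0.0], label=2)
# A large logit on the correct token gives a much smaller loss.
confident_loss = sparse_categorical_crossentropy_from_logits([0.0, 0.0, 10.0, 0.0], label=2)
print(uniform_loss, confident_loss)
```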

Note on mixed precision fine-tuning on NVIDIA GPUs

Full precision is recommended for fine-tuning. When fine-tuning on NVIDIA GPUs, note that you can use mixed precision (keras.mixed_precision.set_global_policy('mixed_bfloat16')) to speed up training with minimal effect on training quality. Mixed precision fine-tuning consumes more memory, so it is useful only on larger GPUs.

For inference, half precision (keras.config.set_floatx("bfloat16")) works and saves memory, while mixed precision is not applicable.

# Uncomment the line below if you want to enable mixed precision training on GPUs
# keras.mixed_precision.set_global_policy('mixed_bfloat16')
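As a back-of-the-envelope check on the memory savings from half precision, the sketch below estimates the memory needed to hold the weights alone, assuming 4 bytes per float32 parameter and 2 bytes per bfloat16 parameter. It ignores activations, optimizer state, and framework overhead, so actual usage will be higher.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

def weight_memory_gib(num_params, dtype):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

num_params = 2.6e9  # parameter count quoted in this tutorial
for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: {weight_memory_gib(num_params, dtype):.1f} GiB")
```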

Inference after fine-tuning

After fine-tuning, responses follow the instruction provided in the prompt.

Europe Trip Prompt

prompt = template.format(
    instruction="What should I do on a trip to Europe?",
    response="",
)
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What should I do on a trip to Europe?

Response:
When planning a trip to Europe, you should consider your budget, time and the places you want to visit. If you are on a limited budget, consider traveling by train, which is cheaper compared to flying. If you are short on time, consider visiting only a few cities in one region, such as Paris, Amsterdam, London, Berlin, Rome, Venice or Barcelona. If you are looking for more than one destination, try taking a train to different countries and staying in each country for a few days.

The model now recommends places to visit in Europe.

ELI5 Photosynthesis Prompt

prompt = template.format(
    instruction="Explain the process of photosynthesis in a way that a child could understand.",
    response="",
)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
Explain the process of photosynthesis in a way that a child could understand.

Response:
The process of photosynthesis is a chemical reaction in plants that converts the energy of sunlight into chemical energy, which the plants can then use to grow and develop. During photosynthesis, a plant will absorb carbon dioxide (CO2) from the air and water from the soil and use the energy from the sun to produce oxygen (O2) and sugars (glucose) as a by-product.

The model now explains photosynthesis in simpler terms.

Note that for demonstration purposes, this tutorial fine-tunes the model on a small subset of the dataset for just one epoch and with a low LoRA rank value. To get better responses from the fine-tuned model, you can experiment with:

Increasing the size of the fine-tuning dataset
Training for more steps (epochs)
Setting a higher LoRA rank
Modifying hyperparameter values such as learning_rate and weight_decay.
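One simple way to organize such experiments is to enumerate the combinations to try up front. The sketch below builds a small grid of configurations. The candidate values are illustrative assumptions, apart from rank 4 and learning rate 5e-5, which are the settings used in this tutorial. Each configuration would still need its own fine-tuning run and evaluation.

```python
from itertools import product

ranks = [4, 8, 16]             # ranks mentioned in this tutorial
learning_rates = [5e-5, 1e-4]  # 5e-5 is the tutorial's value; 1e-4 is an arbitrary alternative
epoch_counts = [1, 2]          # illustrative choices

configs = [
    {"rank": r, "learning_rate": lr, "epochs": e}
    for r, lr, e in product(ranks, learning_rates, epoch_counts)
]

for config in configs:
    print(config)
```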

Summary and next steps

This tutorial covered LoRA fine-tuning on a Gemma model using KerasNLP. Check out the following docs next:

Learn how to generate text with a Gemma model.
Learn how to perform distributed fine-tuning and inference on a Gemma model.
Learn how to use Gemma open models with Vertex AI.
Learn how to fine-tune Gemma using KerasNLP and deploy to Vertex AI.