
Commit a4408e2 (v1.0.1)
Parent: 66c3ff0

13 files changed: +126 −349 lines changed


.github/workflows/ci.yml

Lines changed: 27 additions & 17 deletions
@@ -33,21 +33,31 @@ jobs:
     needs: build
     runs-on: ubuntu-latest
     if: github.ref == 'refs/heads/main' && github.event_name == 'push'
+    strategy:
+      max-parallel: 1
+      matrix:
+        python-version: ['3.9']
     steps:
-      - uses: actions/checkout@v4
-      - name: Configure Git Credentials
-        run: |
-          git config user.name github-actions[bot]
-          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
-      - uses: actions/setup-python@v5
-        with:
-          python-version: '3.10'
-      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
-      - uses: actions/cache@v4
-        with:
-          key: mkdocs-material-${{ env.cache_id }}
-          path: .cache
-          restore-keys: |
-            mkdocs-material-
-      - run: pip install -r requirements.txt
-      - run: mkdocs gh-deploy --force
+      - uses: actions/checkout@v4
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install Dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements-dev.txt
+      - name: Configure Git Credentials
+        run: |
+          git config user.name github-actions[bot]
+          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
+      - uses: actions/cache@v4
+        with:
+          key: mkdocs-material-${{ env.cache_id }}
+          path: .cache
+          restore-keys: |
+            mkdocs-material-
+      - name: Publish Documentation
+        run: |
+          echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
+          mkdocs gh-deploy --force
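As context for the cache step above, `cache_id` is the UTC ISO week number (the shell in the workflow computes it with `date --utc '+%V'`), so the `mkdocs-material-<week>` key rotates the documentation build cache weekly. A minimal Python sketch of the same computation:

```python
from datetime import datetime, timezone

# The workflow's cache key embeds the UTC ISO 8601 week number (1-53),
# so the mkdocs-material cache is rebuilt at most once per week.
week = datetime.now(timezone.utc).isocalendar().week
cache_id = f"{week:02d}"
cache_key = f"mkdocs-material-{cache_id}"
```

One caveat worth knowing about the workflow itself: `actions/cache` expands `env.cache_id` when its own step runs, so the `echo "cache_id=..." >> $GITHUB_ENV` line has to execute in an earlier step for the key to be populated.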
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+name: Documentation
+
+on:
+  release:
+    types: [created]
+
+jobs:
+  docs:
+    runs-on: ubuntu-latest
+    strategy:
+      max-parallel: 1
+      matrix:
+        python-version: ['3.9']
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install Dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements-dev.txt
+      - name: Configure Git Credentials
+        run: |
+          git config user.name github-actions[bot]
+          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
+      - uses: actions/cache@v4
+        with:
+          key: mkdocs-material-${{ env.cache_id }}
+          path: .cache
+          restore-keys: |
+            mkdocs-material-
+      - name: Publish Documentation
+        run: |
+          echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
+          mkdocs gh-deploy --force

README.md

Lines changed: 2 additions & 0 deletions
@@ -251,3 +251,5 @@ A new family of open language models demonstrating strong performance across aca
 * Google<end_of_turn>
 
 ```
+
+**Read the documentation:** [https://thewebscraping.github.io/gemma-template/](https://thewebscraping.github.io/gemma-template/)

docs/benchmark.md

Lines changed: 5 additions & 5 deletions
@@ -47,7 +47,7 @@ VMLU is a benchmark suite designed to evaluate foundational models with a focus
 |---------------------|:----------------:|:-----:|:--------------:|:----------:|:------:|:-----:|:----------:|
 | 1624257089558187281 | 05/01/2025 17:56 | 20.14 | 29.35 | 29.84 | 25.76 | 25.61 | 1497 |
 
-#### Results:
+#### Results
 * Out of 9,834 attempts, 1,497 responses were unanswered.
 * The dataset and evaluation results can be downloaded from the file: `gemma-benchmark/gemma_2b_vmlu_answers.csv`. Although it is not within the scope of this fine tuning.

@@ -57,20 +57,20 @@ VMLU is a benchmark suite designed to evaluate foundational models with a focus
 |---------------------|:----------------:|:-----:|:--------------:|:----------:|:------:|:-----:|:----------:|
 | 1840435368978448913 | 06/01/2025 19:04 | 36.11 | 43.45 | 41.92 | 39.06 | 39.64 | 82 |
 
-#### Results:
+#### Results
 * Out of 9,834 attempts, 82 responses were unanswered.
 * The dataset and evaluation results can be downloaded from the file: `gemma-benchmark/gemma_2b_it_vmlu_benchmark.csv`. Although it is not within the scope of this fine tuning.
 
-#### My Gemma Fine Tuning VMLU Score:
+#### My Gemma Fine Tuning VMLU Score
 
 ![Screenshot VMLU_Gemma_Fine_Tuning.png](images/Screenshot_VMLU_Gemma_Fine_Tuning.png)
 
-#### VMLU Leaderboard Score:
+#### VMLU Leaderboard
 There is a clear difference between the VMLU rankings in the Gemma 2B IT fine tuning, the score is close to the score of the **Gemma 7B IT** model. Here is a screenshot of the **VMLU Leaderboard** rankings:
 
 ![Screenshot VMLU_Gemma_Fine_Tuning.png](images/Screenshot_VMLU_Leaderboard.png)
 
-#### Additional Resources:
+#### Additional Resources
 * VMLU Website: [https://vmlu.ai/](https://vmlu.ai/)
 * VMLU Leaderboard: [https://vmlu.ai/leaderboard](https://vmlu.ai/leaderboard)
 * VMLU Github Repository: [https://github.com/ZaloAI-Jaist/VMLU/](https://github.com/ZaloAI-Jaist/VMLU/)

docs/custom_templates/custom_template.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Custom Templates to Vietnamese Language
 Gemma Template uses Jinja2 template.
 
-See also: [`models.Attr`](../../models/#attributes_5)
+See also: [`models.Attr`](../models.md#attr)
 
 * * *

docs/custom_templates/default_template.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Default Templates
 Gemma Template uses Jinja2 template.
 
-See also: [`models.Attr`](../../models/#attributes_5)
+See also: [`models.Attr`](../models.md#attributes_5)
 
 * * *

docs/generate_methods.md

Lines changed: 7 additions & 1 deletion
@@ -6,11 +6,13 @@
 True
 ```
 
-See also: [Method Arguments](../models/#method-arguments)
+See also: [Method Arguments](models.md#method-arguments)
 
 ## Generate User Prompt
 Create user prompt for Gemma Fine tuning.
 
+See also: [Method Arguments](models.md#method-arguments)
+
 !!! Parameters
     * **max_hidden_words (Union[int, float]):** default `0`.
     * Replace words in the document with '_____'.

@@ -105,6 +107,8 @@ Gemma open models are built from _____ same _____ and technology _____ Gemini mo
 ## Generate Model Prompt
 Create model prompt for Gemma Fine tuning.
 
+See also: [Method Arguments](models.md#method-arguments)
+
 ```pycon
 >>> prompt = gemma_template.generate_model_prompt(
 ... document='This is a Test!',

@@ -134,6 +138,8 @@ Test
 ## Generate Prompt for Question
 Quickly create question prompts using the Gemma model.
 
+See also: [Method Arguments](models.md#method-arguments)
+
 ```pycon
 >>> prompt = gemma_template.generate_prompt(document='This is a Test!')
 '''
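The `max_hidden_words` behaviour documented in this file can be pictured with a short sketch. `mask_words` below is a hypothetical stand-in, not the library's implementation: per the parameter docs, a float is read as a fraction of the word count and an int as an absolute cap, and chosen words are replaced with `'_____'`.

```python
import random

def mask_words(text: str, max_hidden_words, seed: int = 0) -> str:
    """Replace randomly chosen words with '_____' (illustrative only)."""
    words = text.split()
    n = len(words)
    # Float: fraction of the word count; int: absolute number of words.
    if isinstance(max_hidden_words, float):
        limit = int(n * max_hidden_words)
    else:
        limit = min(max_hidden_words, n)
    rng = random.Random(seed)  # seeded so the masking is reproducible
    for i in rng.sample(range(n), limit):
        words[i] = "_____"
    return " ".join(words)
```

For example, `mask_words("one two three four five six", 0.5)` hides three of the six words, which is the kind of partially masked input shown in the sample output above.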

docs/index.md

Lines changed: 19 additions & 20 deletions
@@ -3,7 +3,7 @@
 This library was developed for the Kaggle challenge:
 [**Google - Unlocking Global Communication with Gemma**](https://www.kaggle.com/competitions/gemma-language-tuning), sponsored by Google.
 
-## Credit Requirement
+### Credit Requirement
 
 **Important:** If you are a participant in the competition and wish to use this source code in your submission,
 you must clearly credit the original author before the competition's end date, **January 14, 2025**.

@@ -17,7 +17,7 @@ GitHub: [https://github.com/thewebscraping/gemma-template/](https://github.com/t
 LinkedIn: [https://www.linkedin.com/in/thetwofarm](https://www.linkedin.com/in/thetwofarm)
 ```
 
-# Overview
+## Overview
 
 Gemma Template is a lightweight and efficient Python library for generating templates to fine-tune models and craft prompts.
 Designed for flexibility, it seamlessly supports Gemma, LLaMA, and other language frameworks, offering fast, user-friendly customization.

@@ -35,64 +35,62 @@ As a newbie, I created Gemma Template based on what I read and learned from the
 
 Gemma Template supports exporting dataset files in three formats: `Text`, `Alpaca`, and `OpenAI`.
 
-# Multilingual Content Writing Assistant
+## Multilingual Content Writing Assistant
 
 This writing assistant is a multilingual professional writer specializing in crafting structured, engaging, and SEO-optimized content.
 It enhances text readability, aligns with linguistic nuances, and preserves original context across various languages.
 
 ---
 
-## Key Features:
-#### 1. **Creative and Engaging Rewrites**
+### Key Features:
+### 1. **Creative and Engaging Rewrites**
 - Transforms input text into captivating and reader-friendly content.
 - Utilizes vivid imagery and descriptive language to enhance engagement.
 
-#### 2. **Advanced Text Analysis**
+### 2. **Advanced Text Analysis**
 - Processes text with unigrams, bigrams, and trigrams to understand linguistic patterns.
 - Ensures language-specific nuances and cultural integrity are preserved.
 
-#### 3. **SEO-Optimized Responses**
+### 3. **SEO-Optimized Responses**
 - Incorporates keywords naturally to improve search engine visibility.
 - Aligns rewritten content with SEO best practices for discoverability.
 
-#### 4. **Professional and Multilingual Expertise**
+### 4. **Professional and Multilingual Expertise**
 - Full support for creating templates in local languages.
 - Supports multiple languages with advanced prompting techniques.
 - Vocabulary and grammar enhancement with unigrams, bigrams, and trigrams instruction template.
 - Supports hidden mask input text. Adapts tone and style to maintain professionalism and clarity.
 - Full documentation with easy configuration prompts and examples.
 
-#### 5. **Customize Advanced Response Structure and Dataset Format**
+### 5. **Customize Advanced Response Structure and Dataset Format**
 - Supports advanced response structure format customization.
 - Compatible with other models such as LLaMa.
 - Enhances dynamic prompts using Round-Robin loops.
 - Outputs multiple formats such as Text, Alpaca and OpenAI.
 
-**Installation**
-----------------
+## **Installation**
 
 To install the library, you can choose between two methods:
 
-#### **1\. Install via PyPI:**
+### **1\. Install via PyPI:**
 
 ```shell
 pip install gemma-template
 ```
 
-#### **2\. Install via GitHub Repository:**
+### **2\. Install via GitHub Repository:**
 
 ```shell
 pip install git+https://github.com/thewebscraping/gemma-template.git
 ```
 
-**Quick Start**
-----------------
+## **Quickstart**
 Start using Gemma Template with just a few lines of code:
 
-## Load Dataset
+### Load Dataset
 Returns: A Hugging Face Dataset or DatasetDict object containing the processed prompts.
 
-**Load Dataset from data dict**
+#### **Load Dataset from data dict**
 ```python
 from gemma_template import gemma_template
 

@@ -112,7 +110,8 @@ dataset = gemma_template.load_dataset(data_dict, output_format='text') # enum:
 print(dataset['text'][0])
 ```
 
-**Load Dataset from local file path or HuggingFace dataset**
+#### **Load Dataset from local file path or HuggingFace dataset**
+
 ```python
 from gemma_template import gemma_template
 

@@ -133,7 +132,7 @@ dataset = gemma_template.load_dataset(
 )
 ```
 
-## Fully Customized Template
+### Fully Customized Template
 
 ```python
 from gemma_template import Template, FieldPosition, INPUT_TEMPLATE, OUTPUT_TEMPLATE, INSTRUCTION_TEMPLATE, PROMPT_TEMPLATE

@@ -169,7 +168,7 @@ response = template_instance.apply_template(
 print(response)
 ```
 
-### Output:
+#### Output
 
 ```text
 <start_of_turn>user
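The "Round-Robin loops" item in the feature list above can be sketched with `itertools.cycle`. This is a hypothetical illustration, not the library's implementation: prompt-template variants are handed out in rotation, so consecutive dataset rows do not all share the same wording.

```python
from itertools import cycle

# Two hypothetical prompt variants served in round-robin order.
templates = cycle([
    "Rewrite the text below:\n{doc}",
    "Improve the following article:\n{doc}",
])

documents = ["first doc", "second doc", "third doc"]
# Each document is paired with the next variant in the cycle,
# so variants repeat evenly instead of being chosen at random.
prompts = [next(templates).format(doc=d) for d in documents]
```

Compared with random sampling, a round-robin rotation guarantees every variant appears with near-equal frequency even in small datasets.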
