more-optimizations-to-NotGPT-hpo-script #254

@david-thrower

Tweaks and refactors to

From #253

TO DO:

  • ### Add to text generation samples:

  • temperature=0.75, top_k=75, top_p=0.98, presence_penalty=1.4, frequency_penalty=1.4

  • temperature=0.7, top_k=75, top_p=0.98, presence_penalty=1.4, frequency_penalty=1.4

  • temperature=0.6, top_k=75, top_p=0.98, presence_penalty=1.4, frequency_penalty=1.4
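
A minimal sketch of how the three new sampling configurations above could be declared in one place. The names `GenerateParams`, `SHARED`, and `NEW_SAMPLE_PARAMS` are hypothetical and not taken from the HPO script; only the numeric values come from this issue.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class GenerateParams:
    """One sampling configuration for a text generation sample (hypothetical container)."""
    temperature: float
    top_k: int
    top_p: float
    presence_penalty: float
    frequency_penalty: float


# Values shared by all three new configurations listed in the issue.
SHARED = dict(top_k=75, top_p=0.98, presence_penalty=1.4, frequency_penalty=1.4)

# Only the temperature varies across the three new samples.
NEW_SAMPLE_PARAMS = [GenerateParams(temperature=t, **SHARED) for t in (0.75, 0.7, 0.6)]
```

Calling `asdict(params)` on each entry yields a plain dict that could be unpacked into whatever generate call the script already uses, assuming its keyword names match.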

  • ### Add trial number metadata and perplexity to printouts.

Reorder printouts to the format f"Trial #: {trial_number} Text Sample #:
{text_sample_number} generate_params temperature={temperature}, top_k={top_k},
top_p={top_p}, presence_penalty={presence_penalty},
frequency_penalty={frequency_penalty} PROMPT: {prompt} RESPONSE: {response}"
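
The printout order described above can be sketched as a small helper. The function name `format_sample`, the dict-based `generate_params` argument, and the `response` argument name are assumptions for illustration. Where perplexity should appear is not specified in this issue, so it is left out here.

```python
def format_sample(trial_number, text_sample_number, generate_params, prompt, response):
    """Build one printout line in the order requested in the issue.

    `generate_params` is assumed to be a dict holding the five sampling settings.
    """
    p = generate_params
    return (
        f"Trial #: {trial_number} Text Sample #: {text_sample_number} "
        f"generate_params temperature={p['temperature']}, top_k={p['top_k']}, "
        f"top_p={p['top_p']}, presence_penalty={p['presence_penalty']}, "
        f"frequency_penalty={p['frequency_penalty']} "
        f"PROMPT: {prompt} RESPONSE: {response}"
    )
```

Building the line in one function keeps the field order identical across trials, which makes the logs easier to grep or parse afterwards.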

  • ### Best hyperparameters from the last run to start from:
    SAMPLES_TO_CREATE: 681
    PROMPT_LENGTH: 1
    MAX_SEQ_LENGTH: 40
    POSITIONAL_EMBEDDING_DROPOUT: 0.7526060078475657
    activation: gelu
    predecessor_level_connection_affinity_factor_first: 28.222701134894965
    predecessor_level_connection_affinity_factor_main: 13.935296560312914
    max_consecutive_lateral_connections: 6
    p_lateral_connection: 0.20526326890287255
    num_lateral_connection_tries_per_unit: 25
    learning_rate: 0.005739425114415373
    epochs: 65
    batch_size: 10
    gradient_accumulation_steps: 5
    minimum_levels: 2
    maximum_levels: 2
    minimum_units_per_level: 2
    maximum_units_per_level: 3
    minimum_neurons_per_unit: 1
    maximum_neurons_per_unit: 2
    VOCABULARY_SIZE: 128260
    EMBEDDING_N: 9 / EMBEDDING_DIM: 18
    PROJECTION_N: 1

Try the gelu, relu, swish, and softsign activations on the next runs.
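
One way to start the next runs from the best hyperparameters listed above while covering the four activations is to build one seed configuration per activation. This is only a sketch: `BEST_PARAMS`, `ACTIVATIONS`, and `seed_configs` are hypothetical names, and the key names simply mirror the printout above rather than the script's actual parameter names.

```python
# Best hyperparameters from the last run, copied from the list above.
BEST_PARAMS = {
    "SAMPLES_TO_CREATE": 681,
    "PROMPT_LENGTH": 1,
    "MAX_SEQ_LENGTH": 40,
    "POSITIONAL_EMBEDDING_DROPOUT": 0.7526060078475657,
    "activation": "gelu",
    "predecessor_level_connection_affinity_factor_first": 28.222701134894965,
    "predecessor_level_connection_affinity_factor_main": 13.935296560312914,
    "max_consecutive_lateral_connections": 6,
    "p_lateral_connection": 0.20526326890287255,
    "num_lateral_connection_tries_per_unit": 25,
    "learning_rate": 0.005739425114415373,
    "epochs": 65,
    "batch_size": 10,
    "gradient_accumulation_steps": 5,
    "minimum_levels": 2,
    "maximum_levels": 2,
    "minimum_units_per_level": 2,
    "maximum_units_per_level": 3,
    "minimum_neurons_per_unit": 1,
    "maximum_neurons_per_unit": 2,
    "VOCABULARY_SIZE": 128260,
    "EMBEDDING_N": 9,
    "EMBEDDING_DIM": 18,
    "PROJECTION_N": 1,
}

# Activations to try on the next runs.
ACTIVATIONS = ("gelu", "relu", "swish", "softsign")


def seed_configs(best_params, activations):
    """Return one copy of the best configuration per activation."""
    return [{**best_params, "activation": name} for name in activations]


SEEDS = seed_configs(BEST_PARAMS, ACTIVATIONS)
```

If the study is run with Optuna (an assumption, the issue does not say), each dict in `SEEDS` could be queued with `study.enqueue_trial(...)`, provided the keys match the names used in the script's `suggest_*` calls.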

  • # Sample number 3 needs to be replaced with something less ambiguous as a text completion
    prompt.

Notes from last HPO Study on NotGPT

2025-10-07--HPO-On-850-samples-WEB-Genesis -1.pdf
