Parameters Reference

Complete guide to all configuration parameters available when testing AI models.


Overview

Control AI model behavior through these key parameters:

Parameter          Range        Default  Impact
-----------------  -----------  -------  --------------------------
Temperature        0.0 - 2.0    0.7      Creativity vs consistency
Max Tokens         1 - 4096     500      Response length limit
Top P              0.0 - 1.0    1.0      Token selection diversity
Frequency Penalty  -2.0 - 2.0   0.0      Reduce repetition
Presence Penalty   -2.0 - 2.0   0.0      Encourage topic diversity

Temperature

Controls randomness/creativity of responses

Range: 0.0 - 2.0

0.0 ━━━━━━━━━━━ 0.7 ━━━━━━━━━━━ 2.0
Deterministic     DEFAULT      Random

When to Use

  • 0.0 - 0.3: Facts, data extraction, classification
  • 0.4 - 0.9: General use, customer support, content
  • 1.0 - 2.0: Creative writing, brainstorming

See: Temperature Guide for detailed guidance
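The temperature bands above can be captured in a small helper. This is an illustrative sketch, not an official API; the task names and fallback value are assumptions based on the ranges listed here.

```python
def suggest_temperature(task: str) -> float:
    """Map a task type to a temperature from the bands above.

    Task names and values are illustrative, not an official API.
    """
    bands = {
        "extraction": 0.2,       # 0.0 - 0.3: facts, data extraction
        "classification": 0.2,   # 0.0 - 0.3: classification
        "support": 0.7,          # 0.4 - 0.9: customer support, content
        "creative": 1.2,         # 1.0 - 2.0: creative writing
        "brainstorming": 1.2,    # 1.0 - 2.0: brainstorming
    }
    return bands.get(task, 0.7)  # fall back to the 0.7 default
```

Usage: `suggest_temperature("extraction")` returns 0.2, and any unrecognized task falls back to the default of 0.7.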


Max Tokens

Sets the maximum length of the response, in tokens

Range: 1 - 4096 (varies by model)

Understanding Tokens:

  • 1 token ≈ 4 characters
  • 1 token ≈ 0.75 words
  • 100 tokens ≈ 75 words
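The rules of thumb above translate directly into back-of-the-envelope helpers. These are rough estimates only; real tokenizers vary by model.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters-per-token rule of thumb.

    Real tokenizers (BPE-based, model-specific) will differ; this is
    only a ballpark for budgeting.
    """
    return max(1, len(text) // 4)

def words_for_tokens(tokens: int) -> int:
    """Approximate word count a token budget allows (~0.75 words/token)."""
    return int(tokens * 0.75)
```

For example, `words_for_tokens(500)` gives 375, matching the "Medium" setting in the table below.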

Common Settings

Setting    Tokens  Words  Use Case
---------  ------  -----  ------------------------------
Short      100     ~75    Quick answers, classifications
Medium     500     ~375   Standard responses
Long       1000    ~750   Detailed explanations
Very Long  2000    ~1500  Articles, reports

Example

{
  "maxTokens": 500,
  "prompt": "Explain quantum computing"
}

Response: Generation stops after at most 500 tokens (roughly 375 words), possibly mid-sentence if the limit is reached.

Best Practices

Set appropriate limits:

  • Short answers: 100-300 tokens
  • Standard: 300-800 tokens
  • Detailed: 800-2000 tokens

Consider cost:

  • Fewer tokens = lower cost
  • Only pay for tokens you need

Avoid:

  • Setting too low (cuts off mid-sentence)
  • Setting unnecessarily high (wastes cost)

Top P (Nucleus Sampling)

Controls diversity of token selection

Range: 0.0 - 1.0

How It Works:

  • Selects from the top P probability mass
  • Alternative to temperature
  • More predictable than temperature at extremes

Top P = 1.0: Consider all tokens
Top P = 0.9: Consider top 90% probability mass
Top P = 0.5: Consider top 50% (most likely tokens)
Top P = 0.1: Consider top 10% (very focused)
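Nucleus sampling can be illustrated with a toy distribution: keep the smallest set of most-likely tokens whose cumulative probability reaches Top P, then renormalize. This is a conceptual sketch, not any provider's actual implementation.

```python
def nucleus(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the smallest set of tokens whose cumulative probability
    mass reaches top_p, then renormalize. Toy illustration only."""
    kept, total = {}, 0.0
    # Walk tokens from most to least likely.
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        total += p
        if total >= top_p:
            break
    # Renormalize the surviving tokens so they sum to 1.
    return {tok: p / total for tok, p in kept.items()}
```

With `probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}`, a Top P of 0.5 keeps only "the", while 0.9 keeps "the", "a", and "cat" and drops the unlikely tail.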

When to Use

Top P  Behavior               Use Case
-----  ---------------------  ---------------------------
1.0    All tokens considered  DEFAULT - maximum diversity
0.9    Top 90%                Balanced
0.5    Top 50%                Focused responses
0.1    Top 10%                Very conservative

Example

{
  "temperature": 0.8,
  "topP": 0.9
}

Result: Creative but still reasonable

Temperature vs Top P

General Rule: Use one or the other, not both at extreme values

// ✅ Good combinations
{temperature: 0.7, topP: 1.0}
{temperature: 1.0, topP: 0.9}

// ❌ Avoid
{temperature: 0.0, topP: 0.1} // Too restrictive
{temperature: 2.0, topP: 1.0} // Too random

Frequency Penalty

Reduces repetition of tokens based on frequency

Range: -2.0 to 2.0

How It Works:

  • Positive: Decreases likelihood of repeated tokens
  • Negative: Increases likelihood of repeated tokens
  • 0: No penalty (default)

-2.0 ━━━━━━━━━ 0.0 ━━━━━━━━━ 2.0
More repetition  DEFAULT  Less repetition

When to Use

Value        Effect              Use Case
-----------  ------------------  ------------------------
0.0          No penalty          DEFAULT
0.5 - 1.0    Gentle reduction    Avoid minor repetition
1.0 - 2.0    Strong reduction    Prevent repetition
-1.0 - -0.5  Encourage patterns  Technical writing, lists

Example

{
  "frequencyPenalty": 0.5,
  "prompt": "List the benefits of AI"
}

Without Penalty:

AI offers benefits like efficiency. The efficiency of AI...
AI provides efficiency... The efficiency benefits...

With Penalty (0.5):

AI offers benefits like efficiency, accuracy, and scalability.
These advantages include cost reduction and improved decision-making...

Best Practices

✅ Use 0.3 - 0.8 to reduce repetition
✅ Increase if you see repeated phrases
❌ Don't use extreme values unless necessary


Presence Penalty

Encourages topic diversity by penalizing tokens that have already appeared at least once

Range: -2.0 to 2.0

How It Works:

  • Positive: Penalizes tokens that have appeared
  • Negative: Rewards tokens that have appeared
  • 0: No penalty (default)

Difference from Frequency Penalty:

  • Frequency: "How often has it appeared?" (counts occurrences)
  • Presence: "Has it appeared at all?" (binary)

-2.0 ━━━━━━━━━ 0.0 ━━━━━━━━━ 2.0
Narrow focus  DEFAULT  Diverse topics
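The distinction can be made concrete with a sketch of how both penalties adjust token logits before sampling: the frequency penalty scales with the occurrence count, while the presence penalty applies once if the count is nonzero. This mirrors the formula commonly documented by API providers; exact behavior varies by model and vendor.

```python
def apply_penalties(logits: dict[str, float],
                    counts: dict[str, int],
                    freq_penalty: float = 0.0,
                    pres_penalty: float = 0.0) -> dict[str, float]:
    """Adjust logits: frequency penalty scales with how often a token
    has appeared; presence penalty is a one-time hit if it appeared
    at all. Illustrative sketch of the commonly documented formula."""
    adjusted = {}
    for tok, logit in logits.items():
        c = counts.get(tok, 0)
        adjusted[tok] = logit - freq_penalty * c - pres_penalty * (1 if c > 0 else 0)
    return adjusted
```

For example, a token seen 3 times with `freq_penalty=0.5` and `pres_penalty=0.6` loses 0.5 × 3 + 0.6 = 2.1 from its logit, while an unseen token is untouched.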

When to Use

Value        Effect            Use Case
-----------  ----------------  -----------------------
0.0          No penalty        DEFAULT
0.3 - 0.8    Gentle diversity  Explore related topics
0.8 - 2.0    Strong diversity  Brainstorming, ideation
-1.0 - -0.5  Focus on theme    Stay on topic

Example

{
  "presencePenalty": 0.6,
  "prompt": "Discuss AI applications"
}

Without Penalty:

AI in healthcare... healthcare applications... healthcare benefits...
more healthcare examples... healthcare future...

With Penalty (0.6):

AI in healthcare, education, finance, and manufacturing.
Each sector benefits differently... Transportation uses AI for...

Best Practices

✅ Use 0.3 - 0.8 for topic diversity
✅ Combine with frequency penalty for best results
❌ Extreme values can make responses incoherent


Stop Sequences

Defines strings at which the model should stop generating

Format: Array of strings

Use Cases:

  • End at specific markers
  • Control output structure
  • Prevent unwanted continuation

Example

{
  "stopSequences": ["\n\n", "###", "END"],
  "prompt": "List 3 items:\n1."
}

Stops at: The first occurrence of a double newline, ###, or END

Common Stop Sequences

// Stop at double newline
stopSequences: ["\n\n"]

// Stop at section markers
stopSequences: ["###", "---", "END"]

// Stop at specific phrases
stopSequences: ["The End", "Conclusion:", "Summary:"]
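The stopping behavior amounts to truncating output at the earliest occurrence of any configured sequence, which can be sketched as follows (a client-side illustration; real APIs stop generation server-side and typically exclude the stop sequence from the output):

```python
def truncate_at_stop(text: str, stop_sequences: list[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence,
    excluding the sequence itself. Illustrative sketch."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)  # earliest match wins
    return text[:cut]
```

For example, `truncate_at_stop("abc###def---", ["###", "---"])` returns "abc", because "###" appears before "---".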

Parameter Combinations

For Different Use Cases

Factual Q&A

{
  "temperature": 0.2,
  "maxTokens": 300,
  "topP": 1.0,
  "frequencyPenalty": 0.0,
  "presencePenalty": 0.0
}

Customer Support

{
  "temperature": 0.7,
  "maxTokens": 500,
  "topP": 0.9,
  "frequencyPenalty": 0.3,
  "presencePenalty": 0.2
}

Creative Writing

{
  "temperature": 1.0,
  "maxTokens": 1000,
  "topP": 0.95,
  "frequencyPenalty": 0.5,
  "presencePenalty": 0.6
}

Code Generation

{
  "temperature": 0.3,
  "maxTokens": 800,
  "topP": 1.0,
  "frequencyPenalty": 0.0,
  "presencePenalty": 0.0
}
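The four presets above can be kept in a small lookup and merged into a request payload. The preset keys, field names, and `build_request` helper are hypothetical; they simply mirror the JSON examples in this section.

```python
# Presets mirroring the use-case examples above (field names illustrative).
PRESETS = {
    "factual_qa":       {"temperature": 0.2, "maxTokens": 300,
                         "topP": 1.0,  "frequencyPenalty": 0.0,
                         "presencePenalty": 0.0},
    "customer_support": {"temperature": 0.7, "maxTokens": 500,
                         "topP": 0.9,  "frequencyPenalty": 0.3,
                         "presencePenalty": 0.2},
    "creative_writing": {"temperature": 1.0, "maxTokens": 1000,
                         "topP": 0.95, "frequencyPenalty": 0.5,
                         "presencePenalty": 0.6},
    "code_generation":  {"temperature": 0.3, "maxTokens": 800,
                         "topP": 1.0,  "frequencyPenalty": 0.0,
                         "presencePenalty": 0.0},
}

def build_request(prompt: str, use_case: str = "factual_qa") -> dict:
    """Merge a preset with a prompt into a request payload (hypothetical shape)."""
    return {"prompt": prompt, **PRESETS[use_case]}
```

Usage: `build_request("Write a short story", "creative_writing")` yields a payload with temperature 1.0 and a 1000-token budget.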

Testing Strategy

Step 1: Start with Defaults

{
  "temperature": 0.7,
  "maxTokens": 500,
  "topP": 1.0,
  "frequencyPenalty": 0.0,
  "presencePenalty": 0.0
}

Step 2: Adjust One at a Time

If responses are:

  • Too random → Decrease temperature
  • Too repetitive → Increase frequency penalty
  • Too focused → Increase presence penalty
  • Too long → Decrease max tokens

Step 3: Fine-Tune

Test variations and measure:

  • Quality
  • Consistency
  • Cost
  • User satisfaction

Quick Reference

Parameter Cheat Sheet

TEMPERATURE         FREQUENCY PENALTY       PRESENCE PENALTY
0.0-0.3: Facts      0.0: Default            0.0: Default
0.4-0.9: General    0.3-0.8: Reduce reps    0.3-0.8: Topic diversity
1.0-2.0: Creative   >1.0: Strong avoid      >1.0: Explore widely

MAX TOKENS          TOP P                   STOP SEQUENCES
100-300: Short      0.1-0.5: Focused        Custom markers
300-800: Standard   0.9-1.0: Diverse        End indicators
800-2000: Long      1.0: All considered     Section breaks


Master these parameters to fine-tune AI responses for your specific needs.