Weighted N-gram Keyword Extractor

API Overview

Our RESTful API lets you integrate N-gram keyword extraction directly into your applications. It is built for performance and flexibility, so you can analyze text and extract weighted keywords programmatically.

Simple Integration

RESTful endpoints with JSON responses make integration straightforward in any language.

Highly Configurable

Customize extraction with the same powerful settings available in our web interface.

Real-time Processing

Optimized processing keeps response times low, even for large documents.

API Versions

Version	Status	Base URL	Released
v2 (Current)	Stable	`https://api.ngrams.org/v2/`	April 2025
v1	Deprecated	`https://api.ngrams.org/v1/`	June 2024

v1 will be sunset on December 31, 2025. We recommend migrating to v2 as soon as possible.

Authentication

All API requests require authentication using an API key. You can obtain an API key from your dashboard after signing up for an account.

API Key Authentication

Pass your API key in the request header:

HTTP Header

X-API-Key: your_api_key_here

Example Request with Authentication

cURL

curl -X POST "https://api.ngrams.org/v2/extract" \
-H "X-API-Key: your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
  "text": "Natural language processing is a subfield of linguistics, computer science, and artificial intelligence.",
  "n_gram_size": 2,
  "max_results": 10
}'

Security Best Practices

Keep your API key secure and never expose it in client-side code
Rotate your API key periodically through your dashboard
Set up IP restrictions for your API key in the dashboard settings
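
One way to keep the key out of source code on the server side is to read it from an environment variable. The sketch below assumes a variable named NGRAMS_API_KEY, which is a hypothetical name chosen for this example and not something the API defines. It only builds the request headers and does not send anything.

```python
import os


def build_auth_headers(env_var: str = "NGRAMS_API_KEY") -> dict:
    """Build request headers with the API key read from the environment.

    NGRAMS_API_KEY is a hypothetical variable name used for illustration.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "X-API-Key": api_key,
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.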

Rate Limits

To ensure fair usage and system stability, the API enforces rate limits based on your subscription plan.

Plan	Requests per Minute	Requests per Day	Text Size Limit
Free	10	1,000	10,000 characters
Basic	60	10,000	100,000 characters
Professional	300	100,000	500,000 characters
Enterprise	Custom	Custom	Custom
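
A client can check a text against the size limits in the table before sending it. This is a minimal sketch that assumes the limit is counted in Unicode characters, as Python's len() measures them; the table does not say how characters are counted. The lowercase plan keys and the helper name are illustrative.

```python
# Text size limits per plan, copied from the table above (in characters).
# Enterprise limits are custom, so they are not listed here.
TEXT_SIZE_LIMITS = {
    "free": 10_000,
    "basic": 100_000,
    "professional": 500_000,
}


def fits_plan(text: str, plan: str) -> bool:
    """Return True if the text is within the plan's documented size limit."""
    limit = TEXT_SIZE_LIMITS[plan.lower()]
    return len(text) <= limit
```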

Rate Limit Headers

Each API response includes headers to help you track your rate limit usage:

HTTP Headers

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 58
X-RateLimit-Reset: 1656427200
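
A minimal sketch of reading these headers on the client. It assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but this page does not state. The function and field names are illustrative.

```python
import time
from typing import Mapping, NamedTuple, Optional


class RateLimitStatus(NamedTuple):
    limit: int
    remaining: int
    seconds_until_reset: float


def parse_rate_limit(headers: Mapping[str, str],
                     now: Optional[float] = None) -> RateLimitStatus:
    """Parse the X-RateLimit-* headers from a response.

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    reset_at = float(headers["X-RateLimit-Reset"])
    return RateLimitStatus(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        # Never report a negative wait if the reset time has already passed.
        seconds_until_reset=max(0.0, reset_at - now),
    )
```

A plain dict is matched case-sensitively, so pass the header mapping from your HTTP client if it offers case-insensitive lookup.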

Handling Rate Limits

When a rate limit is exceeded, the API returns a 429 Too Many Requests response. We recommend implementing exponential backoff in your applications to handle rate limiting gracefully.

JavaScript

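async function fetchWithRetry(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Exponential backoff in seconds: 1, 2, 4, 8, ...
    let delaySeconds = Math.pow(2, attempt);

    try {
      const response = await fetch(url, options);

      // Anything other than 429 is returned to the caller as-is
      if (response.status !== 429) return response;

      // Prefer the Retry-After header when it holds a number of seconds
      const retryAfter = Number(response.headers.get('Retry-After'));
      if (Number.isFinite(retryAfter) && retryAfter > 0) delaySeconds = retryAfter;
      console.log(`Rate limited. Retrying in ${delaySeconds} seconds...`);
    } catch (error) {
      // Network error: give up on the last attempt, otherwise back off and retry
      if (attempt === maxRetries - 1) throw error;
    }

    // Wait before the next attempt (no need to wait after the last one)
    if (attempt < maxRetries - 1) {
      await new Promise(resolve => setTimeout(resolve, delaySeconds * 1000));
    }
  }

  // Every attempt was rate limited
  throw new Error(`Still rate limited after ${maxRetries} attempts`);
}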
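
The same idea in Python, reduced to the delay calculation so it can be tested without network access. This is a sketch under assumptions: this page does not document a Retry-After header, so it is treated as optional, and the base delay, cap, and jitter are illustrative choices rather than values recommended by the API.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0,
                  jitter: bool = True) -> float:
    """Return how many seconds to wait before retry number `attempt` (0-based).

    Uses Retry-After when it is a valid number of seconds; otherwise falls
    back to exponential backoff: base * 2**attempt, capped at `cap`.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # e.g. an HTTP-date value; fall back to exponential backoff
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # Full jitter: pick a random wait in [0, delay] so that many
        # clients do not all retry at the same instant.
        delay = random.uniform(0.0, delay)
    return delay
```

With jitter disabled the waits for attempts 0 through 3 are 1, 2, 4, and 8 seconds. Jitter trades that predictability for less synchronized retries across clients.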

API Endpoints

The API provides the following endpoints for analyzing text and extracting keywords.

POST /v2/extract

Extract weighted N-gram keywords from a text input.

Request Parameters

Parameter	Type	Required	Description
text	string	Yes	The text to analyze and extract keywords from.
n_gram_size	integer	No	Size of N-grams to extract (1-4). Default: 2
max_results	integer	No	Maximum number of keywords to return (1-100). Default: 20
min_frequency	integer	No	Minimum frequency threshold for keywords. Default: 2
use_stopwords	boolean	No	Whether to filter out common stopwords. Default: true
custom_stopwords	array	No	Additional stopwords to filter out.
use_stemming	boolean	No	Whether to apply word stemming. Default: false
case_sensitive	boolean	No	Whether matching is case-sensitive. Default: false
weights	object	No	Custom weighting factors, keyed by frequency, position, and pos.
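
The ranges in the table can be enforced on the client before a request is sent. A minimal sketch: the helper name build_extract_payload is hypothetical, and only the parameters, ranges, and weight keys listed above are checked.

```python
from typing import Any, Dict, List, Optional


def build_extract_payload(
    text: str,
    n_gram_size: int = 2,
    max_results: int = 20,
    min_frequency: int = 2,
    use_stopwords: bool = True,
    custom_stopwords: Optional[List[str]] = None,
    use_stemming: bool = False,
    case_sensitive: bool = False,
    weights: Optional[Dict[str, float]] = None,
) -> Dict[str, Any]:
    """Validate arguments against the documented ranges and build the JSON body."""
    if not text:
        raise ValueError("text is required")
    if not 1 <= n_gram_size <= 4:
        raise ValueError("n_gram_size must be between 1 and 4")
    if not 1 <= max_results <= 100:
        raise ValueError("max_results must be between 1 and 100")
    payload: Dict[str, Any] = {
        "text": text,
        "n_gram_size": n_gram_size,
        "max_results": max_results,
        "min_frequency": min_frequency,
        "use_stopwords": use_stopwords,
        "use_stemming": use_stemming,
        "case_sensitive": case_sensitive,
    }
    # Optional fields are included only when the caller sets them.
    if custom_stopwords is not None:
        payload["custom_stopwords"] = list(custom_stopwords)
    if weights is not None:
        unknown = set(weights) - {"frequency", "position", "pos"}
        if unknown:
            raise ValueError(f"unknown weight keys: {sorted(unknown)}")
        payload["weights"] = dict(weights)
    return payload
```

The result can be serialized with json.dumps or passed directly to an HTTP client that accepts a JSON body.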

Example Request

cURL

curl -X POST "https://api.ngrams.org/v2/extract" \
-H "X-API-Key: your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
  "text": "Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.",
  "n_gram_size": 2,
  "max_results": 10,
  "weights": {
    "frequency": 1.0,
    "position": 0.8,
    "pos": 1.2
  }
}'

JavaScript

const options = {
  method: 'POST',
  headers: {
    'X-API-Key': 'your_api_key_here',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.',
    n_gram_size: 2,
    max_results: 10,
    weights: {
      frequency: 1.0,
      position: 0.8,
      pos: 1.2
    }
  })
};

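fetch('https://api.ngrams.org/v2/extract', options)
  .then(response => {
    // fetch() resolves even on HTTP errors, so check the status explicitly
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  })
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));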

Python

import requests

url = "https://api.ngrams.org/v2/extract"
headers = {
    "X-API-Key": "your_api_key_here",
    "Content-Type": "application/json"
}
payload = {
    "text": "Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.",
    "n_gram_size": 2,
    "max_results": 10,
    "weights": {
        "frequency": 1.0,
        "position": 0.8,
        "pos": 1.2
    }
}

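# Set a timeout so a stalled connection fails, and raise on HTTP errors such as 401 or 429
response = requests.post(url, headers=headers, json=payload, timeout=30)
response.raise_for_status()
data = response.json()
print(data)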

PHP

$url = 'https://api.ngrams.org/v2/extract';
$data = [
    'text' => 'Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.',
    'n_gram_size' => 2,
    'max_results' => 10,
    'weights' => [
        'frequency' => 1.0,
        'position' => 0.8,
        'pos' => 1.2
    ]
];

$options = [
    'http' => [
        'header' => "Content-type: application/json\r\nX-API-Key: your_api_key_here\r\n",
        'method' => 'POST',
        'content' => json_encode($data)
    ]
];

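$context = stream_context_create($options);
$result = file_get_contents($url, false, $context);
// file_get_contents() returns false on failure (network error or HTTP error status)
if ($result === false) {
    exit("Request failed\n");
}
$response = json_decode($result, true);
print_r($response);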

Example Response

JSON

{
  "status": "success",
  "request_id": "req_7f5d8a923e1b4c8",
  "processing_time_ms": 87,
  "results": {
    "keywords": [
      {
        "text": "natural language",
        "weight": 8.76,
        "frequency": 3,
        "position_score": 0.92,
        "pos_score": 1.3
      },
      {
        "text": "language processing",
        "weight": 7.54,
        "frequency": 2,
        "position_score": 0.95,
        "pos_score": 1.4
      },
      {
        "text": "computer science",
        "weight": 6.38,
        "frequency": 2,
        "position_score": 0.78,
        "pos_score": 1.2
      },
      {
        "text": "artificial intelligence",
        "weight": 5.92,
        "frequency": 2,
        "position_score": 0.76,
        "pos_score": 1.1
      },
      {
        "text": "human language",
        "weight": 5.68,
        "frequency": 2,
        "position_score": 0.73,
        "pos_score": 1.2
      },
      {
        "text": "contextual nuances",
        "weight": 4.85,
        "frequency": 1,
        "position_score": 0.62,
        "pos_score": 1.3
      },
      {
        "text": "understanding contents",
        "weight": 4.64,
        "frequency": 1,
        "position_score": 0.68,
        "pos_score": 1.2
      },
      {
        "text": "computers human",
        "weight": 4.32,
        "frequency": 1,
        "position_score": 0.75,
        "pos_score": 1.0
      },
      {
        "text": "capable understanding",
        "weight": 4.15,
        "frequency": 1,
        "position_score": 0.66,
        "pos_score": 1.1
      },
      {
        "text": "including contextual",
        "weight": 3.89,
        "frequency": 1,
        "position_score": 0.57,
        "pos_score": 1.15
      }
    ],
    "metadata": {
      "text_length": 304,
      "word_count": 57,
      "unique_words": 42,
      "language_detected": "en"
    }
  }
}
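
A short sketch of consuming this response in Python. It assumes the field names shown in the example above, and it treats any status other than "success" as an error, which this page does not document. The sample dictionary is a trimmed copy of the example.

```python
from typing import Any, Dict, List, Tuple


def top_keywords(response: Dict[str, Any], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n highest-weighted (text, weight) pairs from an extract response."""
    if response.get("status") != "success":
        raise ValueError(f"unexpected status: {response.get('status')!r}")
    keywords = response["results"]["keywords"]
    # Sort explicitly by weight rather than relying on the order of the list.
    ranked = sorted(keywords, key=lambda k: k["weight"], reverse=True)
    return [(k["text"], k["weight"]) for k in ranked[:n]]


# Trimmed copy of the example response above.
sample = {
    "status": "success",
    "results": {
        "keywords": [
            {"text": "computer science", "weight": 6.38},
            {"text": "natural language", "weight": 8.76},
            {"text": "language processing", "weight": 7.54},
        ]
    },
}

print(top_keywords(sample, n=2))
# [('natural language', 8.76), ('language processing', 7.54)]
```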

Other Available Endpoints

POST /v2/batch-extract

Process multiple texts in a single request for more efficient analysis.

POST /v2/analyze

Full text analysis including keywords, entities, sentiment, and readability metrics.

GET /v2/usage

Retrieve current API usage statistics and rate limit information.