Weighted N-gram Keyword Extractor

Extract valuable key phrases from your content using intelligent weighting algorithms

Extraction Settings

Unigram Bigram Trigram 4-gram

How It Works

Our Weighted N-gram Keyword Extractor identifies the most important phrases in your text by scoring every candidate phrase and ranking the results. It assigns weights based on:

  • Frequency: How often a phrase appears
  • Position: Where in the text a phrase appears (the beginning and end carry more weight)
  • Part of Speech: Which grammatical role a phrase plays (noun phrases and other grammatically significant constructs are prioritized)

You can adjust the N-gram size to extract single words (unigrams), word pairs (bigrams), three-word phrases (trigrams), or even four-word phrases (4-grams).
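
The weighting idea above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual algorithm: the scoring formula, the U-shaped `position_score` curve, the short `STOPWORDS` list, and all function names are assumptions made for this example, and the part-of-speech weight is left out because it would need a POS tagger.

```python
from collections import defaultdict
import re

# Tiny illustrative stopword list (an assumption, not the tool's real list).
STOPWORDS = {"a", "an", "and", "the", "of", "is", "in", "to", "with"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def position_score(index, total):
    """Return 1.0 at either end of the text, falling to 0.5 in the middle."""
    if total <= 1:
        return 1.0
    relative = index / (total - 1)      # 0.0 at the start, 1.0 at the end
    return 0.5 + abs(relative - 0.5)    # U-shaped curve


def extract_keywords(text, n=2, w_freq=1.0, w_pos=1.0, top_k=5):
    """Rank n-grams by a weighted sum of frequency and best position score."""
    tokens = tokenize(text)
    starts = max(len(tokens) - n + 1, 0)
    freq = defaultdict(int)
    best_pos = defaultdict(float)
    for i in range(starts):
        gram = tokens[i:i + n]
        # Skip n-grams that begin or end with a stopword.
        if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
            continue
        phrase = " ".join(gram)
        freq[phrase] += 1
        best_pos[phrase] = max(best_pos[phrase], position_score(i, starts))
    scored = {p: w_freq * freq[p] + w_pos * best_pos[p] for p in freq}
    # Highest score first; ties are broken alphabetically.
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Calling `extract_keywords(text, n=1)` ranks single words, and `n=3` ranks three-word phrases; raising `w_pos` relative to `w_freq` favors phrases near the start or end of the text.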

Use Cases

  • Content Optimization: Identify key phrases for SEO and content strategy
  • Research Analysis: Quickly find important concepts in academic papers
  • Competitive Analysis: Extract key terms from competitor content
  • Document Summarization: Understand the main topics of lengthy documents
  • Content Creation: Generate ideas for new content based on keyword analysis

Batch Keyword Extraction

Process multiple documents simultaneously to extract keywords at scale

Supports .txt, .doc, .docx, .pdf, .csv (Max 20 files, 10MB each)

Enter Multiple Texts

Each text block will be processed separately

Extraction Settings

Configure settings for all documents

Save Time

Process dozens of documents simultaneously instead of analyzing them one by one

Compare Documents

Identify common keywords across multiple texts to discover patterns and themes
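
As a rough sketch of what such a comparison involves, the Python below counts how many documents each keyword appears in. The `common_keywords` function and the input shape (a mapping from document name to keyword list) are hypothetical and chosen for illustration; the batch tool's own comparison logic is not documented here.

```python
from collections import Counter


def common_keywords(keywords_by_doc, min_docs=2):
    """Return keywords that appear in at least `min_docs` documents.

    `keywords_by_doc` maps a document name to its list of extracted keywords.
    The result is a list of (keyword, document_count) pairs.
    """
    doc_counts = Counter()
    for keywords in keywords_by_doc.values():
        # Use a set so a keyword counts once per document.
        doc_counts.update({k.lower() for k in keywords})
    shared = [(k, c) for k, c in doc_counts.items() if c >= min_docs]
    # Most widely shared first; ties are broken alphabetically.
    return sorted(shared, key=lambda kc: (-kc[1], kc[0]))
```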

Export Flexibility

Download results in CSV or JSON format for further analysis in your preferred tools
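
For readers who post-process results themselves, here is a minimal sketch of writing keyword records to CSV and JSON with Python's standard library. The column names mirror fields shown in the API's example response, but the exact layout of the tool's own export files is an assumption.

```python
import csv
import io
import json


def keywords_to_csv(keywords):
    """Serialize a list of keyword dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["text", "weight", "frequency"],
        extrasaction="ignore",   # drop fields that are not listed above
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(keywords)
    return buffer.getvalue()


def keywords_to_json(keywords):
    """Serialize the same records as an indented JSON document."""
    return json.dumps({"keywords": keywords}, indent=2)
```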

API Documentation

Integrate our powerful N-gram extraction capabilities directly into your applications

API Overview

Our RESTful API lets you integrate N-gram keyword extraction directly into your applications. It is built for performance and flexibility, so you can analyze text, extract meaningful keywords, and gain insights programmatically.

Simple Integration

RESTful endpoints with JSON responses make integration straightforward in any language.

Highly Configurable

Customize extraction with the same powerful settings available in our web interface.

Real-time Processing

Optimized processing keeps response times fast, even for large text documents.

API Versions

| Version | Status | Base URL | Released |
|---|---|---|---|
| v2 (Current) | Stable | https://api.ngrams.org/v2/ | April 2025 |
| v1 | Deprecated | https://api.ngrams.org/v1/ | June 2024 |

v1 will be sunset on December 31, 2025. We recommend migrating to v2 as soon as possible.

Authentication

All API requests require authentication using an API key. You can obtain an API key from your dashboard after signing up for an account.

API Key Authentication

Pass your API key in the request header:

HTTP Header
X-API-Key: your_api_key_here

Example Request with Authentication

cURL
curl -X POST "https://api.ngrams.org/v2/extract" \
-H "X-API-Key: your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
  "text": "Natural language processing is a subfield of linguistics, computer science, and artificial intelligence.",
  "n_gram_size": 2,
  "max_results": 10
}'

Security Best Practices
  • Keep your API key secure and never expose it in client-side code
  • Rotate your API key periodically through your dashboard
  • Set up IP restrictions for your API key in the dashboard settings
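
One common way to keep a key out of source code is to read it from an environment variable on the server. The sketch below assumes a variable named `NGRAMS_API_KEY`; that name and the `build_auth_headers` helper are hypothetical and not defined by the API.

```python
import os


def build_auth_headers(env_var="NGRAMS_API_KEY"):
    """Build request headers with the API key read from the environment.

    The variable name is an assumption for this example; use whatever
    secret storage your deployment provides.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"X-API-Key": api_key, "Content-Type": "application/json"}
```

Failing fast when the variable is missing avoids sending unauthenticated requests that would only come back as errors.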

Rate Limits

To ensure fair usage and system stability, the API enforces rate limits based on your subscription plan.

| Plan | Requests per Minute | Requests per Day | Text Size Limit |
|---|---|---|---|
| Free | 10 | 1,000 | 10,000 characters |
| Basic | 60 | 10,000 | 100,000 characters |
| Professional | 300 | 100,000 | 500,000 characters |
| Enterprise | Custom | Custom | Custom |
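
A client can check these limits before sending a request. The sketch below copies the published numbers for the fixed plans into a lookup table; the helper names are hypothetical, and it assumes the text size limit is inclusive and counted in characters the way Python's `len` counts them.

```python
# Limits copied from the plan table; Enterprise is custom and therefore omitted.
PLAN_LIMITS = {
    "free": {"per_minute": 10, "per_day": 1_000, "max_chars": 10_000},
    "basic": {"per_minute": 60, "per_day": 10_000, "max_chars": 100_000},
    "professional": {"per_minute": 300, "per_day": 100_000, "max_chars": 500_000},
}


def fits_plan(text, plan):
    """Return True if the text is within the plan's text size limit."""
    return len(text) <= PLAN_LIMITS[plan]["max_chars"]


def min_seconds_between_requests(plan):
    """Spacing that keeps a steady request stream under the per-minute limit."""
    return 60.0 / PLAN_LIMITS[plan]["per_minute"]
```

On the Free plan, for example, spacing requests at least 6 seconds apart stays within 10 requests per minute.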

Rate Limit Headers

Each API response includes headers to help you track your rate limit usage:

HTTP Headers
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 58
X-RateLimit-Reset: 1656427200
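
These headers can be turned into a small usage summary. The Python sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but the documentation does not state; `parse_rate_limit` is a hypothetical helper name.

```python
import time


def parse_rate_limit(headers, now=None):
    """Summarize the X-RateLimit-* headers from a response.

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds (unconfirmed).
    `now` can be passed explicitly to make the function easy to test.
    """
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "seconds_until_reset": max(0, reset_at - now),
    }
```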

Handling Rate Limits

When a rate limit is exceeded, the API returns a 429 Too Many Requests response. We recommend implementing exponential backoff in your applications to handle rate limiting gracefully.

JavaScript
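async function fetchWithRetry(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Exponential backoff delay in seconds: 1, 2, 4, ...
    let delaySeconds = Math.pow(2, attempt);

    try {
      const response = await fetch(url, options);

      // Anything other than a rate-limit response goes back to the caller
      if (response.status !== 429) return response;

      // Prefer a numeric Retry-After header when the server sends one
      const retryAfter = Number(response.headers.get('Retry-After'));
      if (Number.isFinite(retryAfter) && retryAfter > 0) delaySeconds = retryAfter;
      console.log(`Rate limited. Retrying in ${delaySeconds} seconds...`);
    } catch (error) {
      // Network error: rethrow once the last attempt has failed
      if (attempt === maxRetries - 1) throw error;
    }

    // Wait before the next attempt, but not after the final one
    if (attempt < maxRetries - 1) {
      await new Promise(resolve => setTimeout(resolve, delaySeconds * 1000));
    }
  }

  // Every attempt was rate limited; fail loudly instead of returning undefined
  throw new Error(`Request still rate limited after ${maxRetries} attempts`);
}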
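
The delay calculation is the core of exponential backoff, so it is worth isolating. The Python sketch below computes the wait time only and adds random jitter, a common refinement that keeps many clients from retrying in lockstep. The `backoff_delay` name, the 60-second cap, and the use of a `Retry-After` header are assumptions; the API documentation lists only the `X-RateLimit-*` headers.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return the seconds to wait before retry number `attempt` (0-based).

    A numeric Retry-After value wins when present. Otherwise the delay is
    drawn at random between 0 and min(cap, base * 2**attempt).
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a number (for example an HTTP date), so use backoff
    ceiling = min(cap, base * (2 ** attempt))
    return ceiling * rng()
```

A retry loop would call `time.sleep(backoff_delay(attempt, response_headers.get("Retry-After")))` after each 429 response and give up after a fixed number of attempts.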

API Endpoints

Our API offers several endpoints to analyze text and extract keywords with various options.

POST /v2/extract

Extract weighted N-gram keywords from a text input.

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The text to analyze and extract keywords from. |
| n_gram_size | integer | No | Size of N-grams to extract (1-4). Default: 2 |
| max_results | integer | No | Maximum number of keywords to return (1-100). Default: 20 |
| min_frequency | integer | No | Minimum frequency threshold for keywords. Default: 2 |
| use_stopwords | boolean | No | Whether to filter out common stopwords. Default: true |
| custom_stopwords | array | No | Custom stopwords to include in filtering. |
| use_stemming | boolean | No | Whether to apply word stemming. Default: false |
| case_sensitive | boolean | No | Whether to treat casing as significant. Default: false |
| weights | object | No | Customized weighting factors (frequency, position, pos). |
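
Validating a request on the client side catches out-of-range values before they count against a rate limit. The sketch below builds a request body from the parameters in the table and checks the documented ranges; `build_extract_payload` is a hypothetical helper, and how the server itself reports invalid values is not covered here.

```python
ALLOWED_WEIGHT_KEYS = {"frequency", "position", "pos"}


def build_extract_payload(text, n_gram_size=2, max_results=20, min_frequency=2,
                          use_stopwords=True, custom_stopwords=None,
                          use_stemming=False, case_sensitive=False, weights=None):
    """Build a JSON-serializable body for POST /v2/extract."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")
    if not 1 <= n_gram_size <= 4:
        raise ValueError("n_gram_size must be between 1 and 4")
    if not 1 <= max_results <= 100:
        raise ValueError("max_results must be between 1 and 100")
    payload = {
        "text": text,
        "n_gram_size": n_gram_size,
        "max_results": max_results,
        "min_frequency": min_frequency,
        "use_stopwords": use_stopwords,
        "use_stemming": use_stemming,
        "case_sensitive": case_sensitive,
    }
    # Optional fields are sent only when the caller provides them.
    if custom_stopwords:
        payload["custom_stopwords"] = list(custom_stopwords)
    if weights:
        unknown = set(weights) - ALLOWED_WEIGHT_KEYS
        if unknown:
            raise ValueError(f"unknown weight keys: {sorted(unknown)}")
        payload["weights"] = dict(weights)
    return payload
```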

Example Request

cURL
curl -X POST "https://api.ngrams.org/v2/extract" \
-H "X-API-Key: your_api_key_here" \
-H "Content-Type: application/json" \
-d '{
  "text": "Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.",
  "n_gram_size": 2,
  "max_results": 10,
  "weights": {
    "frequency": 1.0,
    "position": 0.8,
    "pos": 1.2
  }
}'

JavaScript
const options = {
  method: 'POST',
  headers: {
    'X-API-Key': 'your_api_key_here',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.',
    n_gram_size: 2,
    max_results: 10,
    weights: {
      frequency: 1.0,
      position: 0.8,
      pos: 1.2
    }
  })
};

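fetch('https://api.ngrams.org/v2/extract', options)
  .then(response => {
    // fetch only rejects on network failure, so check the HTTP status explicitly
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.json();
  })
  .then(data => console.log(data))
  .catch(error => console.error('Error:', error));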

Python
import requests

url = "https://api.ngrams.org/v2/extract"
headers = {
    "X-API-Key": "your_api_key_here",
    "Content-Type": "application/json"
}
payload = {
    "text": "Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.",
    "n_gram_size": 2,
    "max_results": 10,
    "weights": {
        "frequency": 1.0,
        "position": 0.8,
        "pos": 1.2
    }
}

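# Set a timeout so a stalled connection cannot hang the script
response = requests.post(url, headers=headers, json=payload, timeout=30)
# Raise an exception for 4xx/5xx responses instead of parsing an error body
response.raise_for_status()
data = response.json()
print(data)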

PHP
$url = 'https://api.ngrams.org/v2/extract';
$data = [
    'text' => 'Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language. The goal is a computer capable of understanding the contents of documents, including the contextual nuances of the language within them.',
    'n_gram_size' => 2,
    'max_results' => 10,
    'weights' => [
        'frequency' => 1.0,
        'position' => 0.8,
        'pos' => 1.2
    ]
];

$options = [
    'http' => [
        'header' => "Content-type: application/json\r\nX-API-Key: your_api_key_here\r\n",
        'method' => 'POST',
        'content' => json_encode($data)
    ]
];

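$context = stream_context_create($options);
$result = file_get_contents($url, false, $context);
if ($result === false) {
    // file_get_contents returns false when the request fails
    exit('Request failed');
}
$response = json_decode($result, true);
print_r($response);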

Example Response

JSON
{
  "status": "success",
  "request_id": "req_7f5d8a923e1b4c8",
  "processing_time_ms": 87,
  "results": {
    "keywords": [
      {
        "text": "natural language",
        "weight": 8.76,
        "frequency": 3,
        "position_score": 0.92,
        "pos_score": 1.3
      },
      {
        "text": "language processing",
        "weight": 7.54,
        "frequency": 2,
        "position_score": 0.95,
        "pos_score": 1.4
      },
      {
        "text": "computer science",
        "weight": 6.38,
        "frequency": 2,
        "position_score": 0.78,
        "pos_score": 1.2
      },
      {
        "text": "artificial intelligence",
        "weight": 5.92,
        "frequency": 2,
        "position_score": 0.76,
        "pos_score": 1.1
      },
      {
        "text": "human language",
        "weight": 5.68,
        "frequency": 2,
        "position_score": 0.73,
        "pos_score": 1.2
      },
      {
        "text": "contextual nuances",
        "weight": 4.85,
        "frequency": 1,
        "position_score": 0.62,
        "pos_score": 1.3
      },
      {
        "text": "understanding contents",
        "weight": 4.64,
        "frequency": 1,
        "position_score": 0.68,
        "pos_score": 1.2
      },
      {
        "text": "computers human",
        "weight": 4.32,
        "frequency": 1,
        "position_score": 0.75,
        "pos_score": 1.0
      },
      {
        "text": "capable understanding",
        "weight": 4.15,
        "frequency": 1,
        "position_score": 0.66,
        "pos_score": 1.1
      },
      {
        "text": "including contextual",
        "weight": 3.89,
        "frequency": 1,
        "position_score": 0.57,
        "pos_score": 1.15
      }
    ],
    "metadata": {
      "text_length": 304,
      "word_count": 57,
      "unique_words": 42,
      "language_detected": "en"
    }
  }
}
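
A response in this shape can be reduced to a ranked list with a few lines of Python. The sketch below relies only on the fields visible in the example response (`status`, `results.keywords`, `text`, `weight`); `top_keywords` is a hypothetical helper, and the format of error responses is an assumption because none is shown here.

```python
import json


def top_keywords(response_body, min_weight=0.0, limit=None):
    """Return (text, weight) pairs from an /extract response, heaviest first."""
    data = json.loads(response_body)
    if data.get("status") != "success":
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    keywords = data["results"]["keywords"]
    ranked = sorted(
        (k for k in keywords if k["weight"] >= min_weight),
        key=lambda k: k["weight"],
        reverse=True,
    )
    # A limit of None keeps every keyword that passed the weight filter.
    return [(k["text"], k["weight"]) for k in ranked[:limit]]
```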

Other Available Endpoints

POST /v2/batch-extract

Process multiple texts in a single request for more efficient analysis.

POST /v2/analyze

Full text analysis including keywords, entities, sentiment, and readability metrics.

GET /v2/usage

Retrieve current API usage statistics and rate limit information.


Ready to integrate keyword extraction into your application?

Sign up for a free API key and start extracting insights from your text.
