API endpoint does not cap input tokens at max token limit

I made a REST API call to the text embedding models. All of them accept a maximum of 512 tokens. However, when I send a request body with many more tokens (about 2000), I do not receive an error. Does my input get capped at 512 tokens?
1 Reply
schkovich (8mo ago)
It does not. If you are working with BERT, you can check the token count yourself with something like this:
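from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
# encode() counts the [CLS] and [SEP] special tokens as well
if len(tokenizer.encode(text)) > 512:
    pass  # handle long text, e.g. truncate it or split it into chunks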
Or, if you are working with ada, count the tokens with tiktoken instead.
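One possible way to "handle long text" is to split the token IDs into windows that fit the limit and embed each window separately. The sketch below is only an illustration, not something the API does for you: `chunk_token_ids` is a hypothetical helper, and the list of integers stands in for real tokenizer output (for BERT you could get it from `tokenizer.encode(text, add_special_tokens=False)`). It uses 510 rather than 512 on the assumption that you want to leave room for the `[CLS]` and `[SEP]` tokens BERT adds to each chunk.

```python
from typing import List


def chunk_token_ids(token_ids: List[int], max_len: int = 510, overlap: int = 0) -> List[List[int]]:
    """Split token IDs into consecutive windows of at most max_len tokens.

    Hypothetical helper for illustration. `overlap` is the number of tokens
    shared between neighbouring windows (0 means no overlap).
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        # Stop once this window already reaches the end of the input.
        if start + max_len >= len(token_ids):
            break
    return chunks


# Stand-in for real tokenizer output: 2000 fake token IDs.
fake_ids = list(range(2000))
chunks = chunk_token_ids(fake_ids, max_len=510)
print([len(c) for c in chunks])  # [510, 510, 510, 470]
```

Each chunk can then be sent as its own request. How you combine the resulting embeddings (averaging, keeping only the first chunk, and so on) depends on your use case.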