Georgi Gerganov
92139b90af
tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)
* tests : add test-tokenizer-0.sh
* unicode : add all unicode number ranges
* starcoder : fix pre-tokenizer
* tests : add test that fails with DeepSeek tokenizers
* falcon : fix regex
* unicode : regenerate unicode tables
* refact : add tokenizer model
* lint : fix
* tests : disable failing tests
ggml-ci
* refact : add tests files
ggml-ci
* convert : print -> logging
ggml-ci
* lint : fix
* unicode : digit -> number
* phi-3 : update
2024-05-04 08:32:32 +03:00