Equivset

Equivset is a library for detecting visually similar UTF-8 characters.

Equivset is designed to prevent abuse through imitation of words, and it focusses primarily on letters and punctuation (not emojis or other symbols). It contains mappings of visually identical characters from Unicode Confusables, such as Latin "A" and Greek "Α" (alpha), as well as additional mappings for visually similar characters, such as "S" and "$" (dollar sign).
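The idea of an equivalence mapping can be illustrated with a minimal Python sketch. The `EQUIVALENTS` table and the `normalize` and `looks_alike` helpers below are hypothetical names made up for this illustration; the table holds only the two example pairs mentioned above, not the real Equivset dataset, and the choice of which character acts as the representative is an assumption.

```python
# Illustrative subset only: these two pairs come from the examples in the
# text above. This is NOT the real Equivset dataset.
EQUIVALENTS = {
    "\u0391": "A",  # Greek capital alpha -> Latin A (assumed representative)
    "$": "S",       # dollar sign -> Latin S (assumed representative)
}


def normalize(text: str) -> str:
    """Replace each character with its representative, if one is listed."""
    return "".join(EQUIVALENTS.get(ch, ch) for ch in text)


def looks_alike(a: str, b: str) -> bool:
    """Treat two strings as visually equivalent if they normalize identically."""
    return normalize(a) == normalize(b)
```

Under these assumptions, `looks_alike("\u0391dmin", "Admin")` compares the normalized forms of both strings, so a name that swaps Latin "A" for Greek alpha is treated as the same name.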

It is used in Wikimedia software to determine whether two characters are visually equivalent.

Data
The library provides its dataset of equivalent character sets in a standard JSON format and a plain-text format.

It also provides an access library for PHP.
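For consumers outside PHP, the JSON form of the dataset could be read directly. The following Python sketch assumes the JSON file is a flat object that maps each character to a representative character; that structure, the `SAMPLE_JSON` contents, and the `load_equivset` and `normalize` helper names are assumptions made for illustration and should be checked against the actual published files.

```python
import json
from typing import Dict

# Hypothetical sample standing in for the real JSON file. Assumed structure:
# a flat object mapping each character to its representative character.
SAMPLE_JSON = '{"\\u0391": "A", "$": "S"}'


def load_equivset(raw: str) -> Dict[str, str]:
    """Parse a JSON object of character -> representative pairs."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return {str(key): str(value) for key, value in data.items()}


def normalize(text: str, mapping: Dict[str, str]) -> str:
    """Replace each character with its representative, if the mapping has one."""
    return "".join(mapping.get(ch, ch) for ch in text)
```

In practice the string passed to `load_equivset` would come from reading the dataset file from disk rather than from an inline sample.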

