The utf8data.h file in this directory is generated from the Unicode
Character Database for version 12.1.0 of the Unicode standard.

The full set of files can be found here:

http://www.unicode.org/Public/12.1.0/ucd/

Individual source links:

https://www.unicode.org/Public/12.1.0/ucd/CaseFolding.txt
https://www.unicode.org/Public/12.1.0/ucd/DerivedAge.txt
https://www.unicode.org/Public/12.1.0/ucd/extracted/DerivedCombiningClass.txt
https://www.unicode.org/Public/12.1.0/ucd/DerivedCoreProperties.txt
https://www.unicode.org/Public/12.1.0/ucd/NormalizationCorrections.txt
https://www.unicode.org/Public/12.1.0/ucd/NormalizationTest.txt
https://www.unicode.org/Public/12.1.0/ucd/UnicodeData.txt

md5sums (verify by running "md5sum -c README.utf8data"):

900e76da1d822a160fd6b8c0b1d70094  CaseFolding.txt
131256380bff4fea8ad4a851616f2f10  DerivedAge.txt
e731a4089b30002144e107e3d6f8d1fa  DerivedCombiningClass.txt
a47c9fbd7ff92a9b261ba9831e68778a  DerivedCoreProperties.txt
fcab6dad15e440879d92f315978f93d3  NormalizationCorrections.txt
f9ff1c55a60decf436100f791b44aa98  NormalizationTest.txt
755f6af699f8c8d2d958da411f78f6c6  UnicodeData.txt

sha1sums (verify by running "sha1sum -c README.utf8data"):

dc9245f6803c4ac99555c361f5052e0b13eb779b  CaseFolding.txt
3281104f237184cdb5d869e86eb8573678ada7da  DerivedAge.txt
2f5f995ccb96e0fa84b15151b35d5e2681535175  DerivedCombiningClass.txt
5b8698a3fcd5018e1987f296b02e2c17e696415e  DerivedCoreProperties.txt
cd83935fbc012345d8792d2c704f69497e753835  NormalizationCorrections.txt
ea419aae505b337b0d99a83fa83fe58ddff7c19f  NormalizationTest.txt
dc973c0fc93d6f09d9ab9f70d1c9f89c447f0526  UnicodeData.txt

To update to a newer version of the Unicode standard, download the
latest released version of the UCD, which can be found here:

http://www.unicode.org/Public/UCD/latest/

Then, build under fs/unicode/ with REGENERATE_UTF8DATA=1:

make REGENERATE_UTF8DATA=1 fs/unicode/

Sanity check the newly generated utf8data.h file (the version
generated from the 12.1.0 UCD should be 4,109 lines long and have a
total size of 324k), compare it with the older version of
utf8data.h_shipped, or both; then rename it to utf8data.h_shipped.

If you are a kernel developer updating to a newer version of the
Unicode Character Database, please update this README.utf8data file
with the version of the UCD that was used and the md5sums and sha1sums
of the *.txt files before checking in the new versions of the
utf8data.h and README.utf8data files.
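
Where md5sum and sha1sum are not available, the checksum lists in this
file can also be checked with a short script. The following is only a
minimal sketch, not part of the kernel tree: the function names
parse_checksums and verify_checksums are hypothetical, and it assumes
that each checksum line starts with the hex digest followed by the
file name, and that the downloaded *.txt files sit in one directory.

```python
import hashlib
import re
from pathlib import Path

# Matches checksum lines such as "<hex digest>  CaseFolding.txt".
# A 32-digit digest is treated as MD5 and a 40-digit digest as SHA-1.
CHECKSUM_LINE = re.compile(r"^\s*([0-9a-f]{32}|[0-9a-f]{40})\s+\*?(\S+)\s*$")


def parse_checksums(readme_text):
    """Return a list of (algorithm, hex_digest, filename) tuples.

    Lines that do not look like checksum entries (prose, URLs, blank
    lines) are skipped.
    """
    entries = []
    for line in readme_text.splitlines():
        match = CHECKSUM_LINE.match(line)
        if match:
            digest, filename = match.groups()
            algorithm = "md5" if len(digest) == 32 else "sha1"
            entries.append((algorithm, digest, filename))
    return entries


def verify_checksums(readme_text, ucd_dir):
    """Return the (algorithm, filename) pairs that failed verification.

    A missing file counts as a failure. An empty list means every
    listed digest matched the file found in ucd_dir.
    """
    failures = []
    for algorithm, expected, filename in parse_checksums(readme_text):
        path = Path(ucd_dir) / filename
        if not path.is_file():
            failures.append((algorithm, filename))
            continue
        actual = hashlib.new(algorithm, path.read_bytes()).hexdigest()
        if actual != expected:
            failures.append((algorithm, filename))
    return failures
```

Under those assumptions, calling
verify_checksums(Path("README.utf8data").read_text(), ".") from the
directory that holds the downloaded files returns an empty list when
every md5sum and sha1sum matches.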