path: root/Source/cm_utf8.c
Commit message | Author | Age | Files | Lines
* cm_utf8: Fail on empty input range | Brad King | 2022-01-21 | 1 | -0/+5
  Issue: #23132
* cm_utf8: add an is_valid function | Ben Boeckel | 2019-03-18 | 1 | -0/+19
* cm_utf8: reject codepoints above 0x10FFFF | Ben Boeckel | 2019-03-14 | 1 | -0/+5
  These are invalid because the Unicode standard says so (because UTF-16 as specified today cannot encode them).
* cm_utf8: reject UTF-16 surrogate half codepoints | Ben Boeckel | 2019-03-14 | 1 | -0/+5
* codecvt: Re-implement do_out and do_unshift | Brad King | 2017-05-25 | 1 | -1/+1
  The previous implementation assumed that only one byte would be given in the `from` buffer by the caller at a time. This may be true for MSVC but is not for the GNU library on Windows. Re-implement these methods to handle more than one byte per call.

  Also simplify the state management by keeping all state between calls directly in the `mbstate_t` argument instead of using it to index our own heap-allocated state.

  Fixes: #16893
* Simplify CMake per-source license notices | Brad King | 2016-09-27 | 1 | -11/+2
  Per-source copyright/license notice headers that spell out copyright holder names and years are hard to maintain and often out-of-date or plain wrong. Precise contributor information is already maintained automatically by the version control tool. Ultimately it is the receiver of a file who is responsible for determining its licensing status, and per-source notices are merely a convenience. Therefore it is simpler and more accurate for each source to have a generic notice of the license name and references to more detailed information on copyright holders and full license terms.

  Our `Copyright.txt` file now contains a list of Contributors whose names appeared source-level copyright notices. It also references version control history for more precise information. Therefore we no longer need to spell out the list of Contributors in each source file notice.

  Replace CMake per-source copyright/license notice headers with a short description of the license and links to `Copyright.txt` and online information available from "https://cmake.org/licensing". The online URL also handles cases of modules being copied out of our source into other projects, so we can drop our notices about replacing links with full license text.

  Run the `Utilities/Scripts/filter-notices.bash` script to perform the majority of the replacements mechanically. Manually fix up shebang lines and trailing newlines in a few files. Manually update the notices in a few files that the script does not handle.
* Revise C++ coding style using clang-format | Kitware Robot | 2016-05-16 | 1 | -32/+35
  Run the `Utilities/Scripts/clang-format.bash` script to update all our C++ code to a new style defined by `.clang-format`. Use `clang-format` version 3.8.

  * If you reached this commit for a line in `git blame`, re-run the blame operation starting at the parent of this commit to see older history for the content.

  * See the parent commit for instructions to rebase a change across this style transition commit.
* Remove `//------...` horizontal separator comments | Brad King | 2016-05-09 | 1 | -1/+0
  Modern editors provide plenty of ways to visually separate functions. Drop the explicit comments that previously served this purpose. Use the following command to automate the change:

      $ git ls-files -z -- \
          "*.c" "*.cc" "*.cpp" "*.cxx" "*.h" "*.hh" "*.hpp" "*.hxx" |
        egrep -z -v "^Source/cmCommandArgumentLexer\." |
        egrep -z -v "^Source/cmCommandArgumentParser(\.y|\.cxx|Tokens\.h)" |
        egrep -z -v "^Source/cmDependsJavaLexer\." |
        egrep -z -v "^Source/cmDependsJavaParser(\.y|\.cxx|Tokens\.h)" |
        egrep -z -v "^Source/cmExprLexer\." |
        egrep -z -v "^Source/cmExprParser(\.y|\.cxx|Tokens\.h)" |
        egrep -z -v "^Source/cmFortranLexer\." |
        egrep -z -v "^Source/cmFortranParser(\.y|\.cxx|Tokens\.h)" |
        egrep -z -v "^Source/cmListFileLexer\." |
        egrep -z -v "^Source/cm_sha2" |
        egrep -z -v "^Source/(kwsys|CursesDialog/form)/" |
        egrep -z -v "^Utilities/(KW|cm).*/" |
        xargs -0 sed -i '/^\(\/\/---*\|\/\*---*\*\/\)$/ {d;}'

  This avoids modifying third-party sources and generated sources.
* Fix or cast more integer conversions in cmake | Brad King | 2010-06-29 | 1 | -1/+1
  These were revealed by GCC's -Wconversion option. Fix types where it is easy to do so. Cast in cases we know the integer will not be truncated.
* Fix or cast integer conversions in cmake | Brad King | 2010-06-25 | 1 | -2/+2
  These were revealed by GCC's -Wconversion option. Fix types where it is easy to do so. Cast in cases we know the integer will not be truncated.
* CTest: Do not munge UTF-8 output in XML files | Brad King | 2009-12-08 | 1 | -0/+84
  CTest filters the output from tools and tests to ensure that the XML build/test result documents it generates have valid characters. Previously we just converted all non-ASCII bytes into XML-escaped Unicode characters of the corresponding index. This does not preserve tool output encoded in UTF-8.

  We now assume UTF-8 output from tools and implement decoding as specified in RFC 3629. Valid characters are preserved, possibly with XML escaping. Invalid byte sequences and characters are converted to human-readable hex values with distinguishing tags. See issue #10003.
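
The validity rules named across the commits above (RFC 3629 decoding, failure on an empty input range, rejection of surrogate halves and of code points above 0x10FFFF) can be illustrated with a minimal sketch. This is not the code in `Source/cm_utf8.c`: the function name `sketch_utf8_decode` and its signature are hypothetical, chosen only for illustration, and the overlong-encoding check is taken from RFC 3629 rather than from any commit message listed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not CMake's actual API.
 * Decodes one character from [first, last). On success stores the code
 * point in *pc and returns a pointer just past the bytes consumed.
 * Returns NULL when the range is empty or the sequence is invalid. */
static const unsigned char* sketch_utf8_decode(const unsigned char* first,
                                               const unsigned char* last,
                                               uint32_t* pc)
{
  unsigned char c;
  uint32_t uc;
  uint32_t min; /* smallest code point allowed for this sequence length */
  int extra;    /* number of continuation bytes expected */

  if (first == last) {
    return NULL; /* empty input range */
  }
  c = *first++;
  if (c < 0x80) { /* single-byte (ASCII) */
    *pc = c;
    return first;
  } else if ((c & 0xE0) == 0xC0) { /* 110xxxxx: two bytes */
    extra = 1; uc = c & 0x1Fu; min = 0x80;
  } else if ((c & 0xF0) == 0xE0) { /* 1110xxxx: three bytes */
    extra = 2; uc = c & 0x0Fu; min = 0x800;
  } else if ((c & 0xF8) == 0xF0) { /* 11110xxx: four bytes */
    extra = 3; uc = c & 0x07u; min = 0x10000;
  } else {
    return NULL; /* stray continuation byte or invalid lead byte */
  }

  while (extra-- > 0) {
    if (first == last) {
      return NULL; /* sequence truncated */
    }
    c = *first++;
    if ((c & 0xC0) != 0x80) {
      return NULL; /* not a 10xxxxxx continuation byte */
    }
    uc = (uc << 6) | (c & 0x3Fu);
  }

  if (uc < min) {
    return NULL; /* overlong encoding */
  }
  if (uc >= 0xD800 && uc <= 0xDFFF) {
    return NULL; /* UTF-16 surrogate half */
  }
  if (uc > 0x10FFFF) {
    return NULL; /* beyond the Unicode code space */
  }
  *pc = uc;
  return first;
}
```

A caller in the spirit of the CTest commit could walk a buffer with such a function, copying each successfully decoded character through (with XML escaping where needed) and emitting a hex placeholder for any byte at which decoding returns NULL.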