Reid Kleckner 2b8c69204b [Windows] Convert from UTF-8 to UTF-16 when writing to a Windows console
Summary:
Calling WriteConsoleW is the most reliable way to print Unicode
characters to a Windows console.

If binary data gets printed to the console, attempting to re-encode it
shouldn't be a problem, since garbage in can produce garbage out.

This breaks printing strings in the local codepage, which WriteConsoleA
knows how to handle. For example, this can happen when user source code
is encoded with the local codepage, and an LLVM tool quotes it while
emitting a caret diagnostic. This is unfortunate, but well-behaved tools
should validate that their input is UTF-8 and escape non-UTF-8
characters before sending them to raw_fd_ostream. Clang already does
this, but not all LLVM tools do this.

One drawback to the current implementation is printing a string a byte
at a time doesn't work. Consider this LLVM code:
  for (char C : MyStr) outs() << C;

Because outs() is now unbuffered, we wil try to convert each byte to
UTF-16, which will fail. However, this already didn't work, so I think
we may as well update callers that do that as we find them to print
complete portions of strings. You can see a real example of this in my
patch to SourceMgr.cpp

Fixes PR38669 and PR36267.

Reviewers: zturner, efriedma

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D51558

llvm-svn: 341433
2018-09-05 00:08:56 +00:00
..
2018-07-30 19:41:25 +00:00
2018-02-26 13:05:18 +00:00
2018-08-23 09:42:58 +00:00
2018-08-04 00:23:37 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-05-14 12:53:11 +00:00
2018-03-02 22:00:38 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-04-02 13:49:35 +00:00
2018-07-30 19:41:25 +00:00
2018-03-09 00:23:35 +00:00
2018-07-30 19:41:25 +00:00
2018-07-30 19:41:25 +00:00
2018-04-29 00:45:03 +00:00
2018-07-30 19:41:25 +00:00

Design Of lib/System
====================

The software in this directory is designed to completely shield LLVM from any
and all operating system specific functionality. It is not intended to be a
complete operating system wrapper (such as ACE), but only to provide the
functionality necessary to support LLVM.

The software located here, of necessity, has very specific and stringent design
rules. Violation of these rules means that cracks in the shield could form and
the primary goal of the library is defeated. By consistently using this library,
LLVM becomes more easily ported to new platforms since the only thing requiring
porting is this library.

Complete documentation for the library can be found in the file:
  llvm/docs/SystemLibrary.html
or at this URL:
  http://llvm.org/docs/SystemLibrary.html

While we recommend that you read the more detailed documentation, for the
impatient, here's a high level summary of the library's requirements.

 1. No system header files are to be exposed through the interface.
 2. Std C++ and Std C header files are okay to be exposed through the interface.
 3. No exposed system-specific functions.
 4. No exposed system-specific data.
 5. Data in lib/System classes must use only simple C++ intrinsic types.
 6. Errors are handled by returning "true" and setting an optional std::string
 7. Library must not throw any exceptions, period.
 8. Interface functions must not have throw() specifications.
 9. No duplicate function impementations are permitted within an operating
    system class.

To accomplish these requirements, the library has numerous design criteria that
must be satisfied. Here's a high level summary of the library's design criteria:

 1. No unused functionality (only what LLVM needs)
 2. High-Level Interfaces
 3. Use Opaque Classes
 4. Common Implementations
 5. Multiple Implementations
 6. Minimize Memory Allocation
 7. No Virtual Methods