On Wed, Sep 19, 2012 at 12:49 PM, Cristian Rodríguez
<crrodriguez(a)opensuse.org> wrote:
>> I happen to agree that the binary format is unnecessary,
>
> Ok, then how you implement all those features mentioned in the design
> document to work fast ?

So, quoting the doc:

> Simplicity: little code with few dependencies and minimal waste
> through abstraction.

Text files are as simple as it gets.

> Zero Maintenance: logging is crucial functionality to debug and
> monitor systems, as such it should not be a problem source of its
> own, and work as well as it can even in dire circumstances. For
> example, that means the system needs to react gracefully to problems
> such as limited disk space or /var not being available, and avoid
> triggering disk space problems on its own (e.g. by implementing
> journal file rotation right in the daemon at the time a journal file
> is extended).

That has nothing to do with whether the files are text or not; it's an
implementation issue.

> Robustness: data files generated by the journal should be directly
> accessible to administrators and be useful when copied to different
> hosts with tools like “scp” or “rsync”. Incomplete copies should be
> processed gracefully. Journal file browsing clients should work
> without the journal daemon being around.

Text files provide this natively.

> Portable: journal files should be usable across the full range of
> Linux systems, regardless which CPU or endianess is used. Journal
> files generated on an embedded ARM system should be viewable on an
> x86 desktop, as if it had been generated locally.

ASCII text is the epitome of portability. Use UTF-8 if you need
Unicode support. Either way, it is highly portable.

> Performance: journal operations for appending and browsing should be
> fast in terms of complexity. O(log n) or better is highly advisable,
> in order to provide for organization-wide log monitoring with good
> performance

Appending is O(1) on text files (assuming bounded line length). If
browsing means reading the first, second, third, etc. entries sorted by
time, then browsing is also O(1) per entry in text files. If browsing
means reading the entry at time X (for arbitrary time X), then it's
O(log n) in text files by means of bisection, as text-based logs are
naturally sorted by timestamp.
If you need to search by other fields, you need indices. This can be
done with text files, but syslog-ng already supports a better way:
just dump everything into a (pattern) database as it comes. In any
case, if you introduce indices, appending is no longer O(1).
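
To make the bisection claim concrete, here is a rough sketch of a
timestamp lookup that seeks by byte offset instead of reading the whole
file. It assumes every line starts with a fixed-width, lexicographically
sortable timestamp such as "2012-09-19T12:49:00" and that the file is
sorted by it. The function names and the TS_LEN constant are made up
for illustration.

```python
import os

TS_LEN = 19  # assumed width of a "YYYY-MM-DDTHH:MM:SS" prefix


def _line_at_or_after(f, pos):
    """Return the first complete line that starts at byte offset >= pos."""
    if pos > 0:
        f.seek(pos - 1)
        f.readline()  # skip to the next line start at or after pos
    else:
        f.seek(0)
    return f.readline()


def find_first_at_or_after(path, target):
    """Bisect a sorted text log for the first entry with timestamp >= target."""
    key = target.encode("ascii")
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        while lo < hi:
            mid = (lo + hi) // 2
            line = _line_at_or_after(f, mid)
            if not line or line[:TS_LEN] >= key:
                hi = mid      # the answer starts at or before this probe
            else:
                lo = mid + 1  # the answer starts after this probe
        line = _line_at_or_after(f, lo)
    return line.decode("utf-8", "replace").rstrip("\n") if line else None
```

Each probe reads at most two lines, so the number of seeks grows with
the logarithm of the file size, not with the number of entries.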

> Integration: the journal should be closely integrated with the rest
> of the system, so that logging is so basic for a service, that it
> would need to opt-out of it in order to avoid it. Logging is a core
> responsibility for a service manager, and it should be integrated
> with it reflecting that.

Again, this has nothing to do with text or not.

> Minimal Footprint: journal data files should be small in disk size,
> especially in the light that the amount of data generated might be
> substantially bigger than on classic syslog.

While text files have a tendency to include some bloat, both because
of the format and because of the verbose nature of human-readable
messages, data compression can easily compensate for those
redundancies, and it has been mainstream practice for logging for ages.
The fact that compression happens at log rotation does not detract from
its effectiveness, and some compression schemes are even random-readable
(think bzip2, whose blocks are compressed independently). Other
compression schemes are packetizable (deflate), making them suitable
for record-by-record on-the-fly compression.
Compressed formats are technically binary, but in practice, due to the
ubiquity and transparency of decompression tools, they are practically
equivalent to text files. You can even use zless and zgrep on them.
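
As an illustration of the record-by-record idea, here is a minimal
sketch using Python's zlib module, which implements deflate. Each record
is followed by a sync flush, so everything written so far can be
decompressed even though the stream has not been finished. The helper
names are made up for this example.

```python
import zlib


def compress_records(records):
    """Compress records one at a time, ending each on a sync-flush boundary."""
    comp = zlib.compressobj()
    chunks = []
    for rec in records:
        data = comp.compress(rec.encode("utf-8") + b"\n")
        data += comp.flush(zlib.Z_SYNC_FLUSH)  # make this record readable now
        chunks.append(data)
    return chunks


def read_prefix(chunks):
    """Decompress whatever chunks have been written so far."""
    decomp = zlib.decompressobj()
    return b"".join(decomp.decompress(c) for c in chunks).decode("utf-8")
```

The price is a somewhat worse compression ratio than compressing the
whole file at rotation time, since every flush adds a few bytes.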

> General Purpose Event Storage: the journal should be useful to store
> any kind of journal entry, regardless of its format, its meta data
> or size.

Text files can contain arbitrary data, so there's nothing more
general-purpose than text files. If you need binary blobs, there are
various encodings that will produce ASCII-safe text representing them.
Just to mention two: hex and Base64.
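
For example, a binary blob can ride along in an ordinary log line as
Base64. This is only a sketch: the " blob=" field name and both helper
functions are invented for illustration and are not part of any
existing syslog convention.

```python
import base64


def encode_blob_field(message, blob):
    """Append a binary blob to a log message as an ASCII-safe Base64 field."""
    return "%s blob=%s" % (message, base64.b64encode(blob).decode("ascii"))


def decode_blob_field(line):
    """Split a line made by encode_blob_field back into (message, blob)."""
    message, _, encoded = line.rpartition(" blob=")
    return message, base64.b64decode(encoded)
```

The encoded line contains no newlines or control characters, so grep,
less and friends keep working on the file.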

> Unification: the numerous different logging technologies should be
> unified so that all loggable events end up in the same data store,
> so that global context of the journal entries is stored and
> available later. e.g. a firmware entry is often followed by a kernel
> entry, and ultimately a userspace entry. It is key that the relation
> between the three is not lost when stored on disk.
>
> Base for Higher Level Tools: the journal should provide a generally
> useful API which can be used by health monitors, recovery tools,
> crash report generators and other higher level tools to access the
> logged journal data.
>
> Universality: as a basic building block of the OS the journal should
> be universal enough and extensible to cater for application-specific
> needs. The format needs to be extensible, and APIs need to be
> available.
>
> Clustering & Network: Today computers seldom work in isolation. It
> is crucial that logging caters for that and journal files and
> utilities are from the ground on developed to support big multi-host
> installations.

Unrelated to the storage format. "Data store" here can easily be a
series of text files.

> Scalability: the same way as Linux scales from embedded machines to
> super computers and clusters, the journal should scale, too. Logging
> is key when developing embedded devices, and also essential at the
> other end of the spectrum, for maintaining clusters. The journal
> needs to focus on generalizing the common use patterns while
> catering for the specific differences, and staying minimal in
> footprint.

Again, there is no reason why text files can't scale. There's no need
to stuff everything into a single text file, and abstraction APIs can
easily query several text files in tandem. The text-vs-binary choice
makes no difference in this regard.
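
Querying several files in tandem can be as simple as a streaming merge.
The sketch below assumes each file is already sorted and that every line
starts with a fixed-width sortable timestamp (19 characters here). The
function name and the ts_len parameter are made up for illustration.

```python
import heapq
from contextlib import ExitStack


def merged_lines(paths, ts_len=19):
    """Yield lines from several sorted log files in global timestamp order."""
    with ExitStack() as stack:
        files = [stack.enter_context(open(p, encoding="utf-8")) for p in paths]
        # heapq.merge only holds one pending line per file in memory.
        for line in heapq.merge(*files, key=lambda l: l[:ts_len]):
            yield line.rstrip("\n")
```

Memory use depends on the number of files, not on their size.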

> Security: Journal files should be authenticated to make undetected
> manipulation impossible.

Text-based log lines can also be authenticated. You have MACs, PGP
ASCII armor, etc.
Even assuming MACs on both implementations, I seriously doubt journal
files are any safer from tampering by root, though, which leaves them
in the same situation as syslog. If root can create a log file, root
can tamper with it.
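
To show that MACs work fine on plain text, here is a minimal sketch
that chains an HMAC-SHA256 tag through the lines of a log. The " mac="
suffix and the function names are invented for illustration; this is
not how syslog or the journal does it.

```python
import hashlib
import hmac


def sign_lines(lines, key):
    """Append to each line a MAC that also covers the previous line's MAC."""
    prev, out = b"", []
    for line in lines:
        tag = hmac.new(key, prev + line.encode("utf-8"), hashlib.sha256).hexdigest()
        out.append("%s mac=%s" % (line, tag))
        prev = tag.encode("ascii")
    return out


def verify_lines(signed, key):
    """Return True if no line was altered, dropped from the middle, or reordered."""
    prev = b""
    for entry in signed:
        line, _, tag = entry.rpartition(" mac=")
        expected = hmac.new(key, prev + line.encode("utf-8"), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(tag, expected):
            return False
        prev = expected.encode("ascii")
    return True
```

This sketch does not detect truncation at the end of the file, and
anyone who holds the key can recompute the whole chain, which is exactly
the root problem described above.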