Apache Flume

Flume User Guide

Introduction

Overview

Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to a centralized data store.

The use of Apache Flume is not restricted to log data aggregation. Since data sources are customizable, Flume can be used to transport massive quantities of event data, including but not limited to network traffic data, social-media-generated data, email messages, and pretty much any data source possible.

Apache Flume is a top-level project at the Apache Software Foundation.

There are currently two release code lines available, versions x and x. Documentation for the x track is available at the Flume x User Guide. This documentation applies to the x track. New and existing users are encouraged to use the x releases so as to leverage the performance improvements and configuration flexibilities available in the latest architecture.

System Requirements

Java Runtime Environment - Java or later
Memory - Sufficient memory for configurations used by sources, channels, or sinks
Disk Space - Sufficient disk space for configurations used by channels or sinks
Directory Permissions - Read/Write permissions for directories used by agent

Architecture

Data flow model

A Flume event is defined as a unit of data flow having a byte payload and an optional set of string attributes. A Flume agent is a JVM process that hosts the components through which events flow from an external source to the next destination (hop).

A Flume source consumes events delivered to it by an external source like a web server. The external source sends events to Flume in a format that is recognized by the target Flume source. For example, an Avro Flume source can be used to receive Avro events from Avro clients or from other Flume agents in the flow that send events from an Avro sink. A similar flow can be defined using a Thrift Flume Source to receive events from a Thrift Sink, a Flume Thrift Rpc Client, or Thrift clients written in any language generated from the Flume thrift protocol.

When a Flume source receives an event, it stores it into one or more channels. The channel is a passive store that keeps the event until it's consumed by a Flume sink. The file channel is one example; it is backed by the local filesystem. The sink removes the event from the channel and puts it into an external repository like HDFS (via Flume HDFS sink) or forwards it to the Flume source of the next Flume agent (next hop) in the flow. The source and sink within the given agent run asynchronously, with the events staged in the channel.

Complex flows

Flume allows a user to build multi-hop flows where events travel through multiple agents before reaching the final destination. It also allows fan-in and fan-out flows, contextual routing, and backup routes (fail-over) for failed hops.
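
The data flow model described earlier (a source stores events in a channel, and a sink drains that channel asynchronously) can be illustrated with a toy model. The following is a minimal sketch using only the JDK, not Flume's actual API: `FlowSketch`, `Event`, `run`, the `origin` attribute, and the end-of-stream sentinel are all hypothetical names introduced for illustration. A bounded in-memory `BlockingQueue` stands in for the channel, whereas a real channel such as the file channel would persist events to disk.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model of the source -> channel -> sink flow.
// None of these types are Flume classes; all names are hypothetical.
final class FlowSketch {

    // An event: a byte payload plus an optional set of string attributes.
    static final class Event {
        final byte[] body;
        final Map<String, String> headers;

        Event(byte[] body, Map<String, String> headers) {
            this.body = body;
            this.headers = headers;
        }
    }

    // Sentinel telling the sink that the source has finished (an assumption
    // of this sketch; it is not part of the model described in the text).
    private static final Event END = new Event(new byte[0], Map.of());

    // Runs one source thread and one sink thread around a bounded channel
    // and returns the payloads the sink delivered, in arrival order.
    static List<String> run(List<String> lines, int channelCapacity) {
        BlockingQueue<Event> channel = new ArrayBlockingQueue<>(channelCapacity);
        List<String> delivered = new ArrayList<>();

        // Source: wraps each incoming line in an event and stores it in the channel.
        Thread source = new Thread(() -> {
            try {
                for (String line : lines) {
                    byte[] body = line.getBytes(StandardCharsets.UTF_8);
                    channel.put(new Event(body, Map.of("origin", "web-server")));
                }
                channel.put(END);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        // Sink: removes events from the channel and hands them to the
        // "external repository" (here, just a list).
        Thread sink = new Thread(() -> {
            try {
                for (Event e = channel.take(); e != END; e = channel.take()) {
                    delivered.add(new String(e.body, StandardCharsets.UTF_8));
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });

        source.start();
        sink.start();
        try {
            source.join();
            sink.join(); // join() makes the sink's writes visible to this thread
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("flow interrupted", ex);
        }
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("GET /index.html", "GET /about.html"), 1));
    }
}
```

With a channel capacity of 1, the source blocks whenever the sink falls behind. This shows how the channel decouples the two sides while still bounding how many events are staged at once.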
License and use
Free for personal use. Attribution required.
- Published Feb 08, 2021