Weave Code
Code Weaver
Helps Laravel developers discover, compare, and choose open-source packages. See popularity, security, maintainers, and scores at a glance to make better decisions.
Feedback
Share your thoughts, report bugs, or suggest improvements.
Subject
Message

Csv Laravel Package

league/csv

View on GitHub
Deep Wiki
Context7

layout: default title: BOM sequence detection and addition description: BOM Handling made straightforward when processing a CSV file

BOM sequences

Definition

To improve interoperability with programs interacting with CSV, the package now provides an enum Bom to help you detect the appropriate BOM sequence.

The Bom enum provides the following value :

  • Bom::Utf8 : which handles the UTF-8 BOM sequence;
  • Bom::utf16Be : which handles the UTF-16 BOM with Big-Endian sequence;
  • Bom::utf16Le : which handles the UTF-16 BOM with Little-Endian sequence;
  • Bom::utf32Be : which handles the UTF-32 BOM with Big-Endian sequence;
  • Bom::utf32Le : which handles the UTF-32 BOM with Little-Endian sequence;

Detection

From a Sequence of Characters

Bom::fromSequence(mixed $sequence): Bom
Bom::tryFromSequence(mixed $sequence): ?Bom

The Bom::tryFromSequence static method expects any type that can be converted to a string and returns the BOM sequence found at its start as a new Bom instance or null otherwise. Instead of returning null the Bom::fromSequence will throw a ValueError exception.

use League\Csv\Bom;

Bom::tryFromSequence('hello world!'); //returns null
Bom::tryFromSequence(Bom::Utf8->value.'hello world!'); //returns Bom::Utf8
Bom::tryFromSequence('hello world!'.Bom::Utf16Le->value); //returns null

From the Encoding Name

Bom::fromEncoding(string $encoding): Bom
Bom::tryFromEncoding(mixed $encoding): ?Bom

The Bom::tryFromEncoding static method expects a valid encoding name as a string if the encoding name after filtering is find a new Bom instance or null is returned otherwise. Instead of returning null the Bom::fromEncoding will throw a ValueError exception.

use League\Csv\Bom;

Bom::tryFromEncoding('utf16'); //returns Bom::Utf16Be
Bom::tryFromEncoding('u_t_F--8'); //returns Bom::Utf8
Bom::tryFromEncoding('foobar'); //returns null

Accessing the BOM properties

Once you have a valid Bom instance you can access:

  • its actual value via the Enum value property;
  • its length via the length() method;
  • the represented encoding name via the encoding() method;
  • isUtf8, isUtf16, isUtf32 method to quicky know which encoding family is used
use League\Csv\Bom;

$bom = Bom::tryFromEncoding('utf16'); //returns Bom::Utf16Be
$bom->value; //returns "\xFE\xFF"
$bom->length(); //returns 2
$bom->encoding(); //returns 'UTF-16BE'
$bom->isUtf8(); //returns false
$bom->isUtf16(); //returns true
$bom->isUtf32(); //returns false

Deprecated features

bom_match and Info::fetchBOMSequence

function League\Csv\bom_match(string $str): string
League\Csv\Info::fetchBOMSequence(string $str): ?string

The League\Csv\bom_match function expects a string and returns the BOM sequence found at its start or an empty string otherwise.

use League\Csv\ByteSequence;
use function League\Csv\bom_match;

bom_match('hello world!'); //returns ''
bom_match(ByteSequence::BOM_UTF8.'hello world!'); //returns '\xEF\xBB\xBF'
bom_match('hello world!'.ByteSequence::BOM_UTF16_BE); //returns ''

Handling CSV documents

Detecting the BOM sequence

public AbstractCsv::getInputBOM(void): string

The CSV document current BOM character is detected using the getInputBOM method. This method returns the currently used BOM character or an empty string if none is found or recognized. The detection is done using the Bom::tryFromSequence static method.

use League\Csv\Reader;

$csv = Reader::from('/path/to/file.csv');
$bom = $csv->getInputBOM();

Setting the outputted BOM sequence

public AbstractCsv::setOutputBOM(Bom|string|null $sequence): self
public AbstractCsv::getOutputBOM(void): string
  • setOutputBOM: sets the outputting BOM you want your CSV to be associated with.
  • getOutputBOM: gets the outputting BOM you want your CSV to be associated with.
use League\Csv\Bom;
use League\Csv\Reader;

$csv = Reader::from('/path/to/file.csv', 'r');
$csv->setOutputBOM(Bom::Utf8);
$bom = $csv->getOutputBOM(); //returns "\xEF\xBB\xBF"

Controlling Input BOM usage

If your document contains a BOM sequence the following methods control its presence when processing it.

AbstractCsv::skipInputBOM(): self;
AbstractCsv::includeInputBOM(): self;
AbstractCsv::isInputBOMIncluded(): bool;
  • skipInputBOM: enables skipping the input BOM from your CSV document.
  • includeInputBOM: preserves the input BOM from your CSV document while accessing its content.
  • isInputBOMIncluded: tells whether skipping or including the input BOM will be done.

If your document does not contain any BOM sequence you can speed up the CSV iterator by preserving its presence, which means that no operation to detect and remove it if present will take place.

$raw_csv = Bom::Utf8->value."john,doe,john.doe@example.com\njane,doe,jane.doe@example.com\n";
$csv = Reader::fromString($raw_csv);
$csv->setOutputBOM(Bom::Utf16Le);
$csv->includeInputBOM();
ob_start();
$csv->output();
$document = ob_get_clean();

The returned $document will contain 2 BOM markers instead of one.

Skipping the BOM Sequence

Weaver

How can I help you explore Laravel packages today?

Conversation history is not saved when not logged in.
Prompt
Add packages to context
No packages found.
davejamesmiller/laravel-breadcrumbs
artisanry/parsedown
christhompsontldr/phpsdk
enqueue/dsn
bunny/bunny
enqueue/test
enqueue/null
enqueue/amqp-tools
milesj/emojibase
bower-asset/punycode
bower-asset/inputmask
bower-asset/jquery
bower-asset/yii2-pjax
laravel/nova
spatie/laravel-mailcoach
spatie/laravel-superseeder
laravel/liferaft
nst/json-test-suite
danielmiessler/sec-lists
jackalope/jackalope-transport