I want to create a read stream that gzip-encodes a plain-text file as it is read.
The Google Cloud Storage library has an upload function that accepts a StreamInterface as a parameter (Bucket::upload reference).
I want to upload a .txt file, but gzip-encoded.
Uploading a plain .txt file is simple:
/** @var \Google\Cloud\Storage\Bucket $bucket */
$fd = fopen('/tmp/file.txt', 'r');
$stream = \GuzzleHttp\Psr7\Utils::streamFor($fd);
$bucket->upload($stream, ['name' => 'file.txt']);
I want to create a stream that gzip-encodes the file on the fly, without storing the full file in memory (just the chunks) or on disk. Is this possible?
I think it should be something like the following code, but producing a gzip file (instead of just applying zlib.deflate to the data):
$fd = fopen('/tmp/file.txt', 'r');
stream_filter_append($fd, 'zlib.deflate', STREAM_FILTER_READ, ['window' => 15]);
$stream = \GuzzleHttp\Psr7\Utils::streamFor($fd);
$bucket->upload($stream, ['name' => 'file.txt.gz']);
Thanks!
I got a bit nerd-sniped and had to write something for this.
Reposting my above comment:
While DEFLATE is the algorithm used by gzip, it is not the format. This is laid out in the response to bugs.php.net/bug.php?id=68556. This stream filter appears to use the DEFLATE format header and trailer, and there does not currently seem to be a built-in gzip stream filter.
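The difference between the formats shows up in the bytes wrapped around the compressed data. A minimal sketch, assuming the zlib extension is loaded, that compresses the same string with the three built-in one-shot functions and compares the results (the variable names are only for illustration):

```php
<?php
// Compress the same input three ways and compare the results.
$text = 'hello world!';

$gzip = gzencode($text);   // gzip container (RFC 1952), starts with the magic bytes 1f 8b
$zlib = gzcompress($text); // zlib container (RFC 1950)
$raw  = gzdeflate($text);  // bare DEFLATE stream (RFC 1951), no header or trailer

printf("gzip starts with: %s\n", bin2hex(substr($gzip, 0, 2)));
printf("zlib starts with: %s\n", bin2hex(substr($zlib, 0, 2)));
printf("sizes: gzip=%d zlib=%d raw=%d\n", strlen($gzip), strlen($zlib), strlen($raw));

// Only the gzip-wrapped data is something gzdecode() (or gunzip) should accept.
var_dump(gzdecode($gzip) === $text);
```

All three carry the same DEFLATE payload; only the wrapper differs, which is why a file produced by a plain deflate filter is not a valid .gz file.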
Well, we can shim in a call to the system's gzip binary with proc_open() and stream the data through that to create a properly-formatted gzip stream.
class GzipCommandFilter extends php_user_filter {
    public $stream;
    private $ph, $pipes;

    public function onCreate(): bool {
        // The array form of the command requires PHP 7.4 or later.
        $this->ph = proc_open(
            ['gzip', '-c', '-'],
            [
                ['pipe', 'r'], // gzip's stdin: plain data is written here
                ['pipe', 'w'], // gzip's stdout: compressed data is read here
                ['pipe', 'w'], // gzip's stderr
            ],
            $this->pipes
        );
        if ($this->ph === false) {
            return false;
        }
        // Non-blocking so filter() never stalls waiting for output.
        stream_set_blocking($this->pipes[1], false);
        stream_set_blocking($this->pipes[2], false);
        return true;
    }

    public function filter($in, $out, &$consumed, $closing): int {
        $written = 0;
        while ($bucket = stream_bucket_make_writeable($in)) {
            fwrite($this->pipes[0], $bucket->data);
            $consumed += $bucket->datalen;
            // Pass along whatever gzip has produced so far (possibly nothing).
            $out_buf = stream_get_contents($this->pipes[1]);
            $written += strlen($out_buf);
            $bucket->data = $out_buf;
            stream_bucket_append($out, $bucket);
        }
        if ($closing) {
            fclose($this->pipes[0]); // closing stdin to signal completion
            // Read to EOF in blocking mode so gzip cannot stall on a full stdout pipe.
            stream_set_blocking($this->pipes[1], true);
            $rest = stream_get_contents($this->pipes[1]);
            $this->waitOnProc(); // gzip should exit once its output is drained
            stream_bucket_append($out, stream_bucket_new($this->stream, $rest));
            return PSFS_PASS_ON;
        } else if ($written > 0) {
            return PSFS_PASS_ON;
        } else {
            return PSFS_FEED_ME;
        }
    }

    // Poll until the gzip process exits; $step and $max are in microseconds.
    protected function waitOnProc($step = 1000, $max = 1000000): void {
        $waited = 0;
        while (proc_get_status($this->ph)['running'] === true) {
            usleep($step);
            $waited += $step;
            if ($waited >= $max) {
                throw new \Exception('Timed out while waiting.');
            }
        }
    }
}
stream_filter_register('gzip', 'GzipCommandFilter');
We would use it like this:
$fh = fopen('/tmp/file.txt', 'rb');
stream_filter_append($fh, 'gzip');
$data = stream_get_contents($fh);
printf("Data: %s\nDecoded: %s\n", bin2hex($data), gzdecode($data));
Which might output something like:
Data: 1f8b0800000000000003cb48cdc9c95728cf2fca495104006dc2b4030c000000
Decoded: hello world!
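If shelling out to gzip is not an option, the same idea can probably be done in-process with PHP's incremental zlib functions, deflate_init() and deflate_add() (available since PHP 7.0). The following is only a sketch under that assumption: the class name GzipNativeFilter and the filter name gzip.native are made up for illustration, and it has not been tried against the Google Cloud Storage client.

```php
<?php
// Hypothetical class and filter name, not part of the filter shown earlier.
class GzipNativeFilter extends php_user_filter {
    public $stream;
    private $ctx;

    public function onCreate(): bool {
        // ZLIB_ENCODING_GZIP selects the gzip container rather than zlib or raw DEFLATE.
        $this->ctx = deflate_init(ZLIB_ENCODING_GZIP);
        return $this->ctx !== false;
    }

    public function filter($in, $out, &$consumed, $closing): int {
        $passed = false;
        while ($bucket = stream_bucket_make_writeable($in)) {
            $consumed += $bucket->datalen;
            // ZLIB_NO_FLUSH lets zlib buffer internally, so the result may be empty for now.
            $bucket->data = deflate_add($this->ctx, $bucket->data, ZLIB_NO_FLUSH);
            if ($bucket->data !== '') {
                stream_bucket_append($out, $bucket);
                $passed = true;
            }
        }
        if ($closing) {
            // ZLIB_FINISH emits any buffered data plus the gzip trailer (CRC32 and length).
            $tail = deflate_add($this->ctx, '', ZLIB_FINISH);
            stream_bucket_append($out, stream_bucket_new($this->stream, $tail));
            $passed = true;
        }
        return $passed ? PSFS_PASS_ON : PSFS_FEED_ME;
    }
}
stream_filter_register('gzip.native', 'GzipNativeFilter');

// Usage: read a temporary file through the filter and check that it round-trips.
$path = tempnam(sys_get_temp_dir(), 'gz');
file_put_contents($path, 'hello world!');
$fh = fopen($path, 'rb');
stream_filter_append($fh, 'gzip.native', STREAM_FILTER_READ);
$data = stream_get_contents($fh);
fclose($fh);
unlink($path);
var_dump(gzdecode($data) === 'hello world!');
```

Only one chunk of input and zlib's internal buffer are held in memory at a time, and no external process or timeout handling is needed. The trade-off is that it depends on the zlib extension instead of a gzip binary on the host.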