Tuesday, March 6, 2012

Efficient file storage for user uploaded files

While disk space is cheap, dealing with large numbers of files can be tedious and, in some cases, unnecessary.  Before settling on a file naming convention, consider how unique your files really are.  At work, we allow our customers to upload logos, and it's possible for multiple customers to upload the same logo.  The last thing we wanted was to maintain multiple copies of the same logo on the server.

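We decided our best approach would be to name the files based on the contents of the file.  If the contents changed, so did the filename, and two uploads with identical contents ended up with the same filename, so only one copy needed to be stored.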

// md5_file() hashes the file directly instead of loading it into a string first
$filename = md5_file($pathToFile);

The MD5-hashed filename is what we linked each customer's logo to.  This turned out to be a fantastic solution for our needs.
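
Here is a minimal sketch of how the deduplication could work in practice, not our exact production code.  The function name storeByContentHash and the storage directory are hypothetical, and it uses copy() for simplicity; for a real upload you would probably want move_uploaded_file() instead.

```php
<?php
// Hypothetical helper: store a file under a name derived from its contents.
// Returns the hash-based filename to save alongside the customer record.
function storeByContentHash(string $sourcePath, string $storageDir): string
{
    $hash = md5_file($sourcePath);
    if ($hash === false) {
        throw new RuntimeException("Unable to read $sourcePath");
    }

    $target = rtrim($storageDir, '/') . '/' . $hash;

    // Identical contents produce the same name, so if the target already
    // exists this content is already stored and nothing needs to be copied.
    if (!file_exists($target) && !copy($sourcePath, $target)) {
        throw new RuntimeException("Unable to store $sourcePath");
    }

    return $hash;
}
```

Two customers uploading the same logo would both get the same filename back, and the second upload would not write anything new to disk.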

What naming conventions do you use for user-uploaded files?  This method seems like it would work well for any uploaded files, because two files only end up with the same name when their contents are identical.  Do you agree or disagree?  Why?

Sunday, March 4, 2012

Smart MySQL Backup

A lot of web sites are hosted on shared servers via web hosting services like HostGator, where a single web server may have a hundred or more customer accounts hosted on it.  While services like this do back up your files and databases, they do not guarantee those backups.  This means that if something goes wrong and their backups can't be restored for any reason, you simply lose your data and they aren't at fault.  For this reason, I do not use cPanel backups on my shared hosting account.

I personally use and recommend Smart MySQL Backup to back up my databases.  It has a lot of great features:
  • Back up individual or all databases, with a separate file for each database
  • UTF-8 support
  • Daily, weekly and monthly backup rotation
  • Send backups to email
  • Upload backups to FTP
  • Handles foreign keys and stored procedures
Once I have that set up and running, I have a simple PHP script to upload the backups to Amazon S3.  These backups can be restored through cPanel/phpMyAdmin, or via the command line.

// Take db backups and copy them to s3
// https://github.com/tpyo/amazon-s3-php-class
require_once 's3.php';

$bucket = 'my-bucket-name';

$backupPath = '/path/to/backups/archive/daily';

$s3 = new S3("[awsAccessKey]", "[awsSecretKey]");

$today = date('Y-m-d');
$expired = date('Y-m-d', strtotime('-5 days'));
foreach(glob($backupPath.'/'.$today.'/*.bz2') as $file) {
    $fileInfo = pathinfo($file);
    
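    // Copy the backup file to S3 (the local copy is left in place)
    $uploaded = $s3->putObject(S3::inputFile($file), $bucket, $fileInfo['basename'], S3::ACL_PRIVATE);
    
    // Remove the backup from 5 days ago, but only once today's upload succeeded.
    // This assumes the backup filenames contain the date in Y-m-d format.
    if ($uploaded) {
        $s3->deleteObject($bucket, str_replace($today, $expired, $fileInfo['basename']));
    }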
}

PHP file upload MIME type is unreliable

I ran into an odd issue this week where a PDF uploaded by a user through the latest Firefox wasn't properly detected as a PDF.  The "type" showed a value of "application/x-word-xxx" instead of "application/pdf".  On my own computer, both Firefox and Chrome worked as expected.

The "type" value in $_FILES is supplied by the browser handling the upload, so it depends on how the user's machine is configured and can't be trusted.  The fix for this appears to be to use mime_content_type on the file after it's been uploaded instead of relying on $_FILES.

echo mime_content_type('php.gif') . "\n"; //image/gif
echo mime_content_type('test.php'); //text/plain
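
Applied to an upload, that check might look something like the sketch below.  The function name isAllowedUpload and the allow-list are my own for illustration, and mime_content_type needs the fileinfo extension to be enabled.

```php
<?php
// Hypothetical helper: check an uploaded file against an allow-list of MIME
// types, using the type detected from the file's contents on the server
// rather than the "type" value reported by the browser.
function isAllowedUpload(string $tmpPath, array $allowedTypes): bool
{
    $detected = mime_content_type($tmpPath);

    return $detected !== false && in_array($detected, $allowedTypes, true);
}
```

In the upload handler you would pass it the temporary file path, for example $_FILES['document']['tmp_name'] (the field name 'document' is just an example), along with array('application/pdf').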