Find directory size

Jan 30, 2009 at 3:48 PM
I'm trying to retrieve the space used by a directory on S3. We wish to allow users to upload to our S3-based storage, but need to monitor how much disk space is being used by each user.

Does anyone have any ideas on how to achieve this? I've tried getting the directory and then accessing the TotalBytes property, but this always returns 0.

It feels like there must be a way of doing this but I've spent ages trying to find the appropriate method.

I'm sure it goes without saying, but recursively going through all the files in the directory and checking the stream size of each file individually is not an option — it could mean downloading several GB of data. What I'm looking for is a way of querying S3 using the ThreeSharp library to find out the total bytes stored in a directory.

Mar 2, 2009 at 5:27 PM
Edited Mar 2, 2009 at 5:33 PM
If anyone else is looking for a solution to this, the best I have come up with is this helper function:

        public long GetBytes(string path)
        {
            ThreeSharpConfig config = new ThreeSharpConfig();
            config.AwsAccessKeyID = "<your access key>";
            config.AwsSecretAccessKey = "<secret key>";
            config.ConnectionLimit = 20;
            config.IsSecure = false;
            config.Format = CallingFormat.SUBDOMAIN;

            ThreeSharpQuery service = new ThreeSharpQuery(config);

            // List every object in the bucket whose key starts with the given path
            BucketListRequest request = new BucketListRequest("<YOUR_BUCKET_NAME>");
            request.QueryList.Add("prefix", path);
            BucketListResponse response = service.BucketList(request);

            XmlDocument bucketXml = response.StreamResponseToXmlDocument();
            XmlNodeList objects = bucketXml.SelectNodes("//*[local-name()='Size']");

            // Sum the <Size> element of each listed object
            long totalBytes = 0;
            foreach (XmlNode obj in objects)
                totalBytes += Convert.ToInt64(obj.InnerText);

            return totalBytes;
        }
This seems to work as expected, although it can be slow (I am currently testing on a fairly volatile ADSL connection, so that may be the cause). In my scenario I cache the total size and update it each time a new file is uploaded.

If anyone knows of a better solution I'd be grateful, but in the meantime I thought I'd share the above simple solution. The function receives a path, then issues a bucket list request with that path as the prefix and gets back an XML response describing all matching S3 objects. A simple XPath query pulls out the Size nodes, which are then cycled through and added to a long. It may well not be the best method, but it works, which is half the battle!
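One caveat worth noting: the underlying S3 ListBucket call returns at most 1,000 keys per request, so the function above will undercount once a prefix holds more objects than that. Below is a sketch of paging through a large listing using the REST API's `marker` query parameter. It assumes ThreeSharp forwards `QueryList` entries straight through as query-string parameters (as the `prefix` usage above suggests); the `IsTruncated`/`Key` XPath lookups are my own additions, mirroring the `Size` query already used, and I haven't tested this against a bucket that large.

```csharp
// Sketch: page through a listing larger than 1,000 keys using S3's "marker" parameter.
// Assumption: ThreeSharp passes QueryList entries as raw query-string parameters.
long totalBytes = 0;
string marker = null;
bool truncated;
do
{
    BucketListRequest request = new BucketListRequest("<YOUR_BUCKET_NAME>");
    request.QueryList.Add("prefix", path);
    if (marker != null)
        request.QueryList.Add("marker", marker);   // resume after the last key seen

    BucketListResponse response = service.BucketList(request);
    XmlDocument bucketXml = response.StreamResponseToXmlDocument();

    foreach (XmlNode size in bucketXml.SelectNodes("//*[local-name()='Size']"))
        totalBytes += Convert.ToInt64(size.InnerText);

    // S3 sets <IsTruncated>true</IsTruncated> when more keys remain;
    // the last <Key> in this page becomes the marker for the next request.
    XmlNode isTruncated = bucketXml.SelectSingleNode("//*[local-name()='IsTruncated']");
    truncated = isTruncated != null && isTruncated.InnerText == "true";
    XmlNodeList keys = bucketXml.SelectNodes("//*[local-name()='Key']");
    if (keys.Count > 0)
        marker = keys[keys.Count - 1].InnerText;
} while (truncated);
```

Each iteration fetches one page of up to 1,000 objects, so the loop makes one HTTP request per page — which also explains why caching the result, as mentioned above, is worthwhile.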