<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>The Storage Management Blog</title>
	<atom:link href="http://storagemanagement.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://storagemanagement.wordpress.com</link>
	<description>Getting the best out of your unstructured data</description>
	<lastBuildDate>Mon, 12 Sep 2011 18:37:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='storagemanagement.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>The Storage Management Blog</title>
		<link>http://storagemanagement.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://storagemanagement.wordpress.com/osd.xml" title="The Storage Management Blog" />
	<atom:link rel='hub' href='http://storagemanagement.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Finding file duplicates can be a chore</title>
		<link>http://storagemanagement.wordpress.com/2011/09/09/finding-file-duplicates-can-be-a-chore/</link>
		<comments>http://storagemanagement.wordpress.com/2011/09/09/finding-file-duplicates-can-be-a-chore/#comments</comments>
		<pubDate>Fri, 09 Sep 2011 16:34:21 +0000</pubDate>
		<dc:creator>g2-661f14d20e785bc44452f17a4b50157e</dc:creator>
				<category><![CDATA[Cleaning up storage]]></category>
		<category><![CDATA[File duplicates]]></category>
		<category><![CDATA[duplicate files]]></category>
		<category><![CDATA[find file duplicates]]></category>

		<guid isPermaLink="false">http://storagemanagement.wordpress.com/?p=31</guid>
		<description><![CDATA[Finding and managing duplicate files is often the first thing admins do to try to clean up storage, so in this blog we thought we&#8217;d take time to look at how to do this in a little more detail.  Finding duplicated files can be a chore.  If you haven&#8217;t done it before, there are usually [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://storagemanagement.files.wordpress.com/2011/09/dups-list1.gif"><img class="alignright size-full wp-image-38" title="Find duplicate files" src="http://storagemanagement.files.wordpress.com/2011/09/dups-list1.gif?w=272&#038;h=233" alt="Find duplicate files" width="272" height="233" /></a>Finding and managing duplicate files is often the first thing admins do to try to clean up storage, so in this blog we thought we&#8217;d take time to look at how to do this in a little more detail.  Finding duplicated files can be a chore.  If you haven&#8217;t done it before, there are usually many thousands of them on your network.  In fact our experience shows that they can make up more than 30% of your used storage!  If you include files stored as attachments in emails this can be even higher.  You can easily produce a list of them with one of the many simple tools available &#8211; but this will just confirm the scale of the problem.  The difficult part is finding time to do something about it.   That requires a more sophisticated approach.</p>
<h2>What do you want to achieve?</h2>
<p>Previously we blogged about <a title="How to archive file duplicates" href="http://storagemanagement.wordpress.com/2011/09/04/archiving-file-duplicates/">how to archive duplicates</a> &#8211; and how to automate this, so you don&#8217;t have to spend time manually finding and dealing with file duplicates.  How do you find the interesting ones in the first place &#8211; the ones that you want to spend time on?  First you need to decide what you are trying to achieve, e.g.</p>
<ul>
<li>Clean up as much storage as possible, as quickly as possible</li>
<li>Find out which users are generating the most duplicates, and tell them about it</li>
<li>Check where most file duplicates are being stored &#8211; and whether whole folders are being duplicated</li>
<li>Decide whether you are concerned about every type of file, or just the ones you hate, like audio or video files.</li>
</ul>
<h2>How accurate should I be?</h2>
<p>Having decided what you are looking for, you need to decide how accurately you want to search.  This is a tradeoff between speed and accuracy &#8211; unless you choose a tool that can run the search automatically to a schedule, in which case speed matters less.  If you want to clean up across your entire network you need to consider this area carefully.  Scaling up a duplicate file search can be a challenge and is highly tool dependent.</p>
<p>A good example of this issue is whether or not to use &#8220;<a title="Checksum" href="http://en.wikipedia.org/wiki/Checksum" target="_blank">checksum</a>&#8221; comparison to find duplication.  In this method the checksum of each file is calculated and compared with the checksum of every other file.  To calculate a checksum the data in each file must be read in full and a calculation applied.  When files get to larger sizes (e.g. 50MB or more), calculating checksums for many of them can take a long time, even in the best conditions (when a file is stored on a local filesystem and the PC or server is powerful and not running other tasks).  Some tools let you do a &#8220;byte by byte&#8221; comparison of files, rather than calculating and comparing checksums.  This takes even longer and arguably gives very little extra accuracy.</p>
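<p>As a rough illustration of checksum comparison, here is a minimal Python sketch.  It is not how any particular tool works: the function names <code>file_digest</code> and <code>find_duplicates</code> are made up for this example, SHA-256 is simply our choice of algorithm, and grouping by file size first is an added shortcut so that files with a unique size are never read at all.</p>

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return groups of two or more."""
    # Step 1: bucket by size, which only needs file metadata.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Step 2: checksum only the files that share a size with another file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) == 1:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_digest[(size, file_digest(path))].append(path)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

<p>Calling <code>find_duplicates</code> on a folder returns a list of groups, each holding the paths that share the same size and checksum.  Every file that gets checksummed is still read in full, which is where the time goes on large files.</p>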
<p>So how accurate is a checksum comparison of files?  This depends on the algorithm used.  With a 32-bit checksum algorithm we estimate that the likelihood of getting an incorrect duplication result on a large network will be once every 20 years or so.  If you&#8217;re cleaning up your storage (rather than carrying out a forensic investigation, for example) then you&#8217;re probably not too bothered by this margin for error.  In fact, for most purposes you can probably dispense with checksum calculation altogether and just use metadata like file update date, size and file type to clean up.</p>
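<p>If you want to sanity-check a margin for error against your own file counts, the standard &#8220;birthday&#8221; approximation is one way to do it.  The sketch below is an assumption-laden estimate, not a measurement: it treats checksum values as uniformly distributed, and the function name <code>collision_probability</code> is ours.  The result depends heavily on how many files are compared with each other, so comparing checksums only among files of the same size keeps the numbers small.</p>

```python
import math


def collision_probability(file_count, checksum_bits=32):
    """Approximate chance that at least two distinct files share a checksum.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)),
    assuming checksum values are uniformly distributed.
    """
    pairs = file_count * (file_count - 1) / 2
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-pairs / 2 ** checksum_bits)
```

<p>Passing a larger <code>checksum_bits</code> value, such as 128 or 256, shows how quickly the risk falls away with a longer checksum.</p>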
<h2>How do I want to organize results?</h2>
<p>You&#8217;ve decided what you want to achieve, and how accurate you want to be.  Next it&#8217;s worth thinking about how to organize the results of your analysis.  If you want to find out who&#8217;s creating the most duplication then looking at results by file owner would be useful.  If you want to find duplicated directories then viewing by folder would be best.  If you just want to clean up wasted storage then using file lists would be easiest.</p>
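<p>To make the &#8220;view by folder&#8221; idea concrete, here is a small Python sketch that takes groups of duplicate paths and counts the redundant copies in each folder.  The function name <code>duplicates_by_folder</code> and the rule of treating the first path in each group as the copy to keep are assumptions made for this example only.</p>

```python
from collections import Counter
from pathlib import PurePath


def duplicates_by_folder(duplicate_groups):
    """Count redundant copies per folder, most affected folder first.

    In each group the first path is treated as the copy to keep, so only
    the remaining paths count as redundant.
    """
    counts = Counter()
    for group in duplicate_groups:
        for path in group[1:]:
            counts[str(PurePath(path).parent)] += 1
    return counts.most_common()
```

<p>The same pattern works for other views: swap the folder for the file owner or the file extension and you get a per-user or per-type summary instead.</p>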
<p style="text-align:center;"><a href="http://storagemanagement.files.wordpress.com/2011/09/dups-list2.gif"><img class="alignnone size-full wp-image-43" title="File duplicates by file" src="http://storagemanagement.files.wordpress.com/2011/09/dups-list2.gif?w=272&#038;h=233" alt="File duplicates by file" width="272" height="233" /></a><br />
<em>An example of file duplicates organized by file</em></p>
<p style="text-align:center;"><em><a href="http://storagemanagement.files.wordpress.com/2011/09/dups-list-users1.gif"><img class="alignnone size-full wp-image-46" title="dups-list-users" src="http://storagemanagement.files.wordpress.com/2011/09/dups-list-users1.gif?w=284&#038;h=176" alt="" width="284" height="176" /></a><br />
An example of file duplicates organized by creator</em></p>
<p style="text-align:center;"><a href="http://storagemanagement.files.wordpress.com/2011/09/dups-list-folder.gif"><img class="alignnone size-full wp-image-48" title="dups-list-folder" src="http://storagemanagement.files.wordpress.com/2011/09/dups-list-folder.gif?w=327&#038;h=244" alt="" width="327" height="244" /></a><br />
<em>An example of file duplication by folder</em></p>
<p style="text-align:left;">The examples above come from our favourite storage management tool, <a title="SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch Storage Suite</a>. Its Duplicates Finder lets you choose from many options to filter where &#8211; and what &#8211; you look for.  It then uses database-driven search to give very fast results.  What&#8217;s more, it will find duplicates across multiple servers and storage appliances &#8211; and even email systems.</p>
<h2>What to do with the results?</h2>
<p>So you&#8217;ve got the results you want, organized as you want them.  What next?  With <a title="SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch</a> you can:</p>
<ul>
<li>Generate and email a report &#8211; e.g. to the owners of the duplicates</li>
<li>Carry out a file action like copy, move, delete &#8211; or more advanced actions like archiving or sending to compressed folders</li>
<li>Save the results &#8211; e.g. to publish on an intranet site, or use in another report</li>
<li>Save the search &#8211; so you can run the exact same search with one click, or automate it to run to a schedule.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://storagemanagement.wordpress.com/2011/09/09/finding-file-duplicates-can-be-a-chore/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">g2-661f14d20e785bc44452f17a4b50157e</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/dups-list1.gif" medium="image">
			<media:title type="html">Find duplicate files</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/dups-list2.gif" medium="image">
			<media:title type="html">File duplicates by file</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/dups-list-users1.gif" medium="image">
			<media:title type="html">dups-list-users</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/dups-list-folder.gif" medium="image">
			<media:title type="html">dups-list-folder</media:title>
		</media:content>
	</item>
		<item>
		<title>Building your business case</title>
		<link>http://storagemanagement.wordpress.com/2011/09/05/building-your-business-case/</link>
		<comments>http://storagemanagement.wordpress.com/2011/09/05/building-your-business-case/#comments</comments>
		<pubDate>Mon, 05 Sep 2011 19:06:10 +0000</pubDate>
		<dc:creator>g2-661f14d20e785bc44452f17a4b50157e</dc:creator>
				<category><![CDATA[Planning]]></category>
		<category><![CDATA[business cases]]></category>
		<category><![CDATA[change management]]></category>
		<category><![CDATA[financial management]]></category>
		<category><![CDATA[planning]]></category>
		<category><![CDATA[scenarios]]></category>

		<guid isPermaLink="false">http://storagemanagement.wordpress.com/?p=27</guid>
		<description><![CDATA[Business cases are used to justify a proposed change.  They include reasoning and background for those who will provide the resources to carry out the change.  Typically they are a key step in securing the money required to implement a plan &#8211; like investing in more storage. Usually, the easiest part of the business case [...]]]></description>
			<content:encoded><![CDATA[<p>Business cases are used to justify a proposed change.  They include reasoning and background for those who will provide the resources to carry out the change.  Typically they are a key step in securing the money required to implement a plan &#8211; like investing in more storage.</p>
<p>Usually, the easiest part of the business case is describing what you want to do &#8211; the &#8220;end state&#8221; or final infrastructure configuration.  This is the &#8220;new&#8221; stuff, and suppliers are always at hand with white papers and product data sheets to help with the detail.  Conversely, the most difficult part is often building the financial justification &#8211; what will be improved as a result of this change, or what cost will be reduced?  This is difficult because it requires a detailed understanding of your current infrastructure &#8211; and no supplier can give you a data sheet covering that!</p>
<p><a href="http://storagemanagement.files.wordpress.com/2011/09/scenarios-chart.gif"><img class="alignnone size-full wp-image-28" title="Planning for change with scenarios" src="http://storagemanagement.files.wordpress.com/2011/09/scenarios-chart.gif?w=628&#038;h=388" alt="Planning for change with scenarios" width="628" height="388" /></a></p>
<p>So what you need for your next storage business case is a painless way to find out what&#8217;s out there &#8211; and what the impact of a planned change might be.  Step up <a title="SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch Storage Suite</a>.  Using its Scenario Planning tool it&#8217;s possible to choose all &#8211; or part &#8211; of your storage, choose a change scenario like &#8220;all files unused for 3 months&#8221;, then calculate in seconds the impact of that change.  What&#8217;s more, the change impact is calculated by looking at what the historic impact would have been on your storage, then extrapolating this forward.  Add in your own TCO (total cost of ownership) figures and it will provide the financial impact as well &#8211; all ready to be pasted into your business case paper.</p>
<p>In the example above we&#8217;ve chosen the &#8220;all files unused for 3 months&#8221; scenario and considered what would happen if we applied it across all our storage.  SPACEWatch lets us compare this with a &#8220;do nothing&#8221; scenario &#8211; where users continue using storage the way they always have &#8211; to clearly see the benefit of the change.  This example is interesting &#8211; not only are there clear and substantial savings, but it will take me a year or more to get back to the same level of storage use.  Storage growth also becomes more linear, indicating that historically the growth in storage taken up by unused files has actually been increasing!  Definitely time to act &#8211; and there&#8217;s the evidence to argue your case.</p>
<p>Why not download and try a free trial of <a title="SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch</a> now &#8211; it will work with any size of network.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemanagement.wordpress.com/2011/09/05/building-your-business-case/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">g2-661f14d20e785bc44452f17a4b50157e</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/scenarios-chart.gif" medium="image">
			<media:title type="html">Planning for change with scenarios</media:title>
		</media:content>
	</item>
		<item>
		<title>Where did that file go?</title>
		<link>http://storagemanagement.wordpress.com/2011/09/04/where-did-that-file-go/</link>
		<comments>http://storagemanagement.wordpress.com/2011/09/04/where-did-that-file-go/#comments</comments>
		<pubDate>Sun, 04 Sep 2011 21:02:45 +0000</pubDate>
		<dc:creator>g2-661f14d20e785bc44452f17a4b50157e</dc:creator>
				<category><![CDATA[File search]]></category>
		<category><![CDATA[Scheduled tasks]]></category>

		<guid isPermaLink="false">http://storagemanagement.wordpress.com/?p=20</guid>
		<description><![CDATA[It used to be easy.  Back in the days of Windows XP you didn&#8217;t have many files.  And you kept them all on one, quite small, hard drive.  When you wanted to find a file you just clicked search.  After a short while (and if you remembered the name right) it would be listed for [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://storagemanagement.files.wordpress.com/2011/09/xp-search.gif"><img class="alignright size-full wp-image-21" title="Searching with Windows XP" src="http://storagemanagement.files.wordpress.com/2011/09/xp-search.gif?w=197&#038;h=398" alt="Searching with Windows XP" width="197" height="398" /></a>It used to be easy.  Back in the days of Windows XP you didn&#8217;t have many files.  And you kept them all on one, quite small, hard drive.  When you wanted to find a file you just clicked search.  After a short while (and if you remembered the name right) it would be listed for you to use.</p>
<p>Don&#8217;t ask me what that dog was all about!</p>
<p>The rest, as they say, is history.  Now we have a Windows 7 search tool that depends on a heavyweight indexing system.  Test your knowledge and prove that it really is easy to use (not my experience) &#8230; can you tell me how to start an advanced search?</p>
<p>Complexity isn&#8217;t the only issue &#8211; Microsoft have increased indexing sophistication because we now create so many more files.  No one worries about having enough disk space &#8211; they create backups and copies of copies and never clean up.</p>
<p>And if you use a network file server, who knows where that file you were working on last month went?  Perhaps it&#8217;s in that email I sent?</p>
<p>The future looks worse &#8211; apparently we&#8217;ll all be using cloud file storage more and more.  Cloud services are typically priced on data transfer and file access frequency, rather than just storage space consumed &#8211; so consider the challenge (translates as &#8220;cost&#8221;) of searching your cloud storage &#8211; ignoring any performance issues!</p>
<p>Microsoft wimped out of building a &#8220;proper&#8221; database-based file system into Windows&#8230; one day perhaps.  In the meantime you can use software like <a title="File search with SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch Storage Suite</a> to give you instant network-wide file search.</p>
<p>The screen shot below shows the main SPACEWatch File Finder window.  You can see the results of a search on the right, and a &#8220;visualisation&#8221; pane on the left.  The visualisation pane shows all the systems (servers etc) and users for whom results have been found.  Results can be filtered by clicking on one of these, or entering text into the filter box on the top right.</p>
<p><a href="http://storagemanagement.files.wordpress.com/2011/09/filefinderfull.gif"><img class="size-full wp-image-22 alignnone" title="File Finder - search across the network" src="http://storagemanagement.files.wordpress.com/2011/09/filefinderfull.gif?w=628&#038;h=378" alt="File Finder - search across the network" width="628" height="378" /></a></p>
<p>Note there are other panes available as tabs on the left and right borders of the window.  These give access to the custom search options, for example.  Custom options include obvious things like file name, size, type, owner etc. but also more unusual options like whether or not the file has ever been used.</p>
<p>SPACEWatch lets you save even the most complex searches and re-use them with one click, or generate reports from them.  Other SPACEWatch users can re-use these searches as well.  Interestingly, how you sort the results is also saved &#8211; so, for example, you can set a limit on how many results to show and sort the results by size &#8211; and get a dynamic &#8220;top x&#8221; list.</p>
<p>SPACEWatch <em>does</em> use a proper database to store the file data it collects &#8211; so you can be assured that even the most complex searches, on the largest networks, produce results in seconds.  Don&#8217;t believe me?  Try downloading one of the <a title="SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">free trial versions</a> and see for yourself.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemanagement.wordpress.com/2011/09/04/where-did-that-file-go/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">g2-661f14d20e785bc44452f17a4b50157e</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/xp-search.gif" medium="image">
			<media:title type="html">Searching with Windows XP</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/filefinderfull.gif" medium="image">
			<media:title type="html">File Finder - search across the network</media:title>
		</media:content>
	</item>
		<item>
		<title>The curse of PST files</title>
		<link>http://storagemanagement.wordpress.com/2011/09/04/the-curse-of-pst-files/</link>
		<comments>http://storagemanagement.wordpress.com/2011/09/04/the-curse-of-pst-files/#comments</comments>
		<pubDate>Sun, 04 Sep 2011 20:09:28 +0000</pubDate>
		<dc:creator>g2-661f14d20e785bc44452f17a4b50157e</dc:creator>
				<category><![CDATA[Cleaning up storage]]></category>
		<category><![CDATA[File types]]></category>
		<category><![CDATA[Microsoft Exchange]]></category>
		<category><![CDATA[MP3 files]]></category>
		<category><![CDATA[PST files]]></category>

		<guid isPermaLink="false">http://storagemanagement.wordpress.com/?p=15</guid>
		<description><![CDATA[Microsoft have never recommended that users store their Outlook data files on the network &#8211; performance can be bad and there&#8217;s always a danger of corruption.  However in return Exchange mailbox databases have traditionally been quite limited in size, so admins tend to set quite low limits on their users (interestingly, not an issue we [...]]]></description>
			<content:encoded><![CDATA[<p>Microsoft have never recommended that users store their Outlook data files on the network &#8211; performance can be bad and there&#8217;s always a danger of corruption.  However in return Exchange mailbox databases have traditionally been quite limited in size, so admins tend to set quite low limits on their users (interestingly, not an issue we see in Lotus Domino shops &#8211; where mail databases tend to grow and grow)&#8230; so users are forced to carry out local archiving to PST files.</p>
<p>This can sometimes be addressed with Exchange archiving solutions &#8211; but keeping local PST archives still gives users the easiest way to search and find what they want in old mailboxes.</p>
<p>So what to do?</p>
<p><a href="http://storagemanagement.files.wordpress.com/2011/09/types-pst.gif"><img class="alignleft size-full wp-image-16" title="PST files" src="http://storagemanagement.files.wordpress.com/2011/09/types-pst.gif?w=304&#038;h=219" alt="Finding PST files" width="304" height="219" /></a>A good start is to find out the scale of the problem.  Take a look at the screen shot on the left.  This is from <a title="Sharpeware SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch</a>&#8217;s File Types analysis tool.</p>
<p>This shows how PST files have been found right across the network &#8211; I can expand the tree to see how many files, and how much storage they&#8217;re consuming.</p>
<p>If I want to investigate further I right click and choose one of the context menu options such as listing the largest ones in a particular area of the network &#8211; or the least used.</p>
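<p>If you just want a quick feel for the scale of the problem on a single share, a few lines of Python will list the largest files of a given type.  This is a minimal sketch with a made-up function name, <code>largest_files_of_type</code>, and it walks the file system directly rather than querying a database, so expect it to be slow on big volumes.</p>

```python
from pathlib import Path


def largest_files_of_type(root, extension=".pst", limit=20):
    """Return (size_in_bytes, path) pairs for the largest matching files."""
    matches = [
        (path.stat().st_size, path)
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() == extension
    ]
    # Largest first, then keep only the top entries.
    matches.sort(key=lambda item: item[0], reverse=True)
    return matches[:limit]
```

<p>Summing the sizes in the result gives a rough figure for how much storage those files are consuming under that folder.</p>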
<p>But what about cleaning up all this potentially wasted space?  There are a number of approaches you can take, all of which SPACEWatch will help with, and many of which can be automated to run to a scheduled routine:</p>
<ul>
<li>Send users a list of the PSTs they own and ask them to check if they&#8217;re still needed &#8211; and remove them if they&#8217;re not.</li>
<li>Archive PST files that haven&#8217;t been used for a long time &#8211; typically leaving a &#8220;stub&#8221; that looks like the original file to any application, but redirects the application to the file&#8217;s new location (which could be cheap secondary storage).</li>
<li>Delete unused PST files.</li>
<li>Leave the PST files where they are, but archive file attachments within them (this is typically what takes up most of the file space): SPACEWatch with its Exchange add-on lets you do this on PST files, Exchange mailboxes and public folders.</li>
</ul>
<p><a href="http://storagemanagement.files.wordpress.com/2011/09/types-top-20.gif"><img class="aligncenter size-full wp-image-17" title="Top twenty file types" src="http://storagemanagement.files.wordpress.com/2011/09/types-top-20.gif?w=557&#038;h=306" alt="Top twenty file types" width="557" height="306" /></a></p>
<p>Perhaps if you follow some of these approaches, you <em>won&#8217;t</em> end up with a top twenty file types chart like this!  By the way, someone&#8217;s already been cleaning up on this network &#8211; you can tell, because &#8220;mp3&#8221; only just scrapes into the top twenty.  Try this with <a title="SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch</a> on your network and I&#8217;ll bet it&#8217;s a lot higher!</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemanagement.wordpress.com/2011/09/04/the-curse-of-pst-files/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">g2-661f14d20e785bc44452f17a4b50157e</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/types-pst.gif" medium="image">
			<media:title type="html">PST files</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/types-top-20.gif" medium="image">
			<media:title type="html">Top twenty file types</media:title>
		</media:content>
	</item>
		<item>
		<title>Archiving file duplicates</title>
		<link>http://storagemanagement.wordpress.com/2011/09/04/archiving-file-duplicates/</link>
		<comments>http://storagemanagement.wordpress.com/2011/09/04/archiving-file-duplicates/#comments</comments>
		<pubDate>Sun, 04 Sep 2011 13:59:06 +0000</pubDate>
		<dc:creator>g2-661f14d20e785bc44452f17a4b50157e</dc:creator>
				<category><![CDATA[Archiving]]></category>
		<category><![CDATA[Cleaning up storage]]></category>
		<category><![CDATA[File duplicates]]></category>
		<category><![CDATA[Lotus Domino]]></category>
		<category><![CDATA[Microsoft Exchange]]></category>
		<category><![CDATA[Scheduled tasks]]></category>
		<category><![CDATA[duplicate file finder]]></category>
		<category><![CDATA[duplicate files]]></category>
		<category><![CDATA[find file duplicates]]></category>

		<guid isPermaLink="false">http://storagemanagement.wordpress.com/?p=7</guid>
		<description><![CDATA[Finding file duplicates across your network has been possible for a while, using tools like SPACEWatch Storage Suite from Sharpeware.  But what to do once you&#8217;ve discovered that there are thousands of them &#8211; and yes, they do take up an inordinate amount of your storage? An interesting new feature has just been added to SPACEWatch&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>Finding file duplicates across your network has been possible for a while, using tools like <a title="SPACEWatch Storage Suite" href="http://www.sharpeware.com" target="_blank">SPACEWatch Storage Suite</a> from Sharpeware.  But what do you do once you&#8217;ve discovered that there are thousands of them &#8211; and yes, they do take up an inordinate amount of your storage?</p>
<p>An interesting new feature has just been added to SPACEWatch&#8217;s &#8220;Enterprise Edition&#8221; &#8211; it lets you archive file duplicates.  This means you can choose what type of files to focus on (e.g. pesky audio files), find all the duplicates, and replace them with stubs pointing to just one of the copies &#8211; all in one task.  What&#8217;s more, once you&#8217;re happy with the way it works, you can schedule tasks to run automatically!</p>
<p>The screenshot below shows some file duplicates that we found on a medium-sized network using SPACEWatch.</p>
<p><a href="http://storagemanagement.files.wordpress.com/2011/09/filedups.gif"><img class="alignnone size-full wp-image-8" title="Finding file duplicates" src="http://storagemanagement.files.wordpress.com/2011/09/filedups.gif?w=628&#038;h=377" alt="Finding file duplicates" width="628" height="377" /></a></p>
<p>I&#8217;ve expanded one of the duplicate file sets.  The green icons indicate files that have never been used (i.e. never opened since they were created).  Pink highlights files that haven&#8217;t been used for a long time.</p>
<p>Now we can manually carry out what Sharpeware call an &#8220;inline&#8221; archive, or a traditional archive.  An inline archive de-duplicates the file set by replacing every file except the one we choose to keep with a stub that redirects applications to that copy.  A traditional archive replaces the files with stubs that redirect applications to another location.  Both can be automated with the administrator tool that comes with SPACEWatch.</p>
<p>Not only is this an interesting &#8211; and easy &#8211; way to reduce storage use and waste, but it can be applied across your network.  Duplicates are found wherever they are &#8211; not just on one PC or server.  In fact, the search can be extended to file attachments in email systems like Exchange and Lotus Domino as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://storagemanagement.wordpress.com/2011/09/04/archiving-file-duplicates/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">g2-661f14d20e785bc44452f17a4b50157e</media:title>
		</media:content>

		<media:content url="http://storagemanagement.files.wordpress.com/2011/09/filedups.gif" medium="image">
			<media:title type="html">Finding file duplicates</media:title>
		</media:content>
	</item>
	</channel>
</rss>
