<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/wordpress-mu-1.0" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>CDLINFO</title>
	<link>http://cdlinfo.cdlib.org</link>
	<description>California Digital Library News</description>
	<pubDate>Thu, 15 May 2008 19:40:21 +0000</pubDate>
	<generator>http://wordpress.org/?v=wordpress-mu-1.0</generator>
	<language>en</language>
			<item>
		<title>CDL Guidelines for Digital Objects, Version 2.0: Updated for METS File element</title>
		<link>http://cdlinfo.cdlib.org/blog/2007/11/15/cdl-guidelines-for-digital-objects-version-20-updated-for-mets-file-element/</link>
		<comments>http://cdlinfo.cdlib.org/blog/2007/11/15/cdl-guidelines-for-digital-objects-version-20-updated-for-mets-file-element/#comments</comments>
		<pubDate>Thu, 15 Nov 2007 19:55:55 +0000</pubDate>
		<dc:creator>raw</dc:creator>
		
		<category>Digital Preservation</category>

		<category>Technology</category>

		<category>Digital Special Collections</category>

		<guid isPermaLink="false">http://cdlinfo.cdlib.org/blog/2007/11/15/cdl-guidelines-for-digital-objects-version-20-updated-for-mets-file-element/</guid>
		<description><![CDATA[<p>The &#34;CDL Guidelines for Digital Objects,  Version 2.0&#34; (CDL GDO) has been updated to include specifications for use  of the METS File &#60;file&#62; element.&#160; </p>]]></description>
			<content:encoded><![CDATA[<p>By Adrian Turner, CDL Data Acquisitions</p>
<p>The  &quot;CDL Guidelines for Digital Objects, Version 2.0&quot; (CDL GDO) has been  updated to include specifications for use of the METS File &lt;file&gt;  element.&nbsp; You can find the updated  version at <a href="http://www.cdlib.org/inside/diglib/guidelines/">http://www.cdlib.org/inside/diglib/guidelines/</a> .</p>
<p>The  revision applies to Sections 2.1, 2.2.2, 3.1, and 3.2.4 only:</p>
<ul>
<li>To  support the orderly transmission and ingest of digital objects, the CDL  recommends the inclusion of checksum (MD5, SHA-1, or CRC32) and byte size  values in the METS File &lt;file&gt; element.&nbsp;  Note that this information is preferred, but not required.</li>
<li>The subheadings within Sections 2.1 and 3.1 have been relabeled and are now consistently based on METS element names.</li>
</ul>
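<p>As a rough illustration of the checksum and byte size recommendation, the following Python sketch builds a METS &lt;file&gt; element with SIZE, CHECKSUM, and CHECKSUMTYPE attributes. The function name <code>mets_file_element</code>, the sample ID, and the sample content are hypothetical, and the METS namespace is omitted for brevity; consult the CDL GDO itself for the exact encoding requirements.</p>

```python
import hashlib
import xml.etree.ElementTree as ET

# Hash constructors keyed by METS CHECKSUMTYPE value (illustrative subset).
HASHERS = {"MD5": hashlib.md5, "SHA-1": hashlib.sha1}


def mets_file_element(file_id, data, mimetype, checksum_type="MD5"):
    """Build a METS file element carrying checksum and byte-size attributes.

    Hypothetical helper for illustration; not part of any CDL tool.
    """
    digest = HASHERS[checksum_type](data).hexdigest()
    attrs = {
        "ID": file_id,
        "MIMETYPE": mimetype,
        "SIZE": str(len(data)),        # byte size of the content file
        "CHECKSUM": digest,            # hex digest of the file content
        "CHECKSUMTYPE": checksum_type,
    }
    return ET.Element("file", attrs)


# Sample content stands in for a real image or text file read in binary mode.
element = mets_file_element("FID1", b"hello", "text/plain")
print(ET.tostring(element, encoding="unicode"))
```

<p>A receiving system could recompute the digest after transfer and compare it with the CHECKSUM value to confirm that the file arrived intact.</p>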
<p>Please  contact the CDL at oacops@ucop.edu if you have any questions.</p>
]]></content:encoded>
			<wfw:commentRss>http://cdlinfo.cdlib.org/blog/2007/11/15/cdl-guidelines-for-digital-objects-version-20-updated-for-mets-file-element/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Digital Preservation News</title>
		<link>http://cdlinfo.cdlib.org/blog/2007/10/17/digital-preservation-news/</link>
		<comments>http://cdlinfo.cdlib.org/blog/2007/10/17/digital-preservation-news/#comments</comments>
		<pubDate>Wed, 17 Oct 2007 21:33:56 +0000</pubDate>
		<dc:creator>raw</dc:creator>
		
		<category>Digital Preservation</category>

		<category>Digital Publishing</category>

		<guid isPermaLink="false">http://cdlinfo.cdlib.org/blog/2007/10/17/digital-preservation-news/</guid>
		<description><![CDATA[<p>The CDL Digital Preservation Group has been busy with a variety of exciting activities.</p>]]></description>
			<content:encoded><![CDATA[<p>By Trisha  Cruse, CDL Director of Digital Preservation</p>
<p>The CDL  Digital Preservation Group has been busy with a variety of exciting activities,  reported below. </p>
<p><strong>Release 4 of the Web Archiving Service</strong><br />
  On September 18th the Web Archiving Group released a new version of the  Web Archiving Service &ndash; special thanks to Tracy   Seneca, Scott Fisher,  Margaret Low, Erik Hetzner, Mark Reyes, and Mike Wooldridge for getting this  release out the door.&nbsp; So far the group has received very positive feedback  from users on the service&rsquo;s functionality and the user interface.&nbsp; We are  also extremely pleased with the performance; we are up to 500 captures with  relatively few hiccups.</p>
<p>We have also put together an overview of the service, available on YouTube &lt;http://tinyurl.com/2tdrwq&gt;.&nbsp; This brief overview explains why the content targeted for this project is at risk, how we plan to address this in the Web Archiving Service, and which collections our curators are working on. Warning: the YouTube video quality is a bit sketchy, so we have also made this presentation available in a high-quality video format; contact tracy.seneca at ucop dot edu for further information.</p>
<p><strong>A kinder  and gentler ARK  page</strong><br />
 Thanks to  Kirsten Neilsen and John Kunze there is now a kinder, gentler introduction to ARK identifiers on Inside CDL &lt;<a href="http://www.cdlib.org/inside/diglib/ark/" title="http://www.cdlib.org/inside/diglib/ark/">http://www.cdlib.org/inside/diglib/ark/</a>&gt;.&nbsp; Don&rsquo;t know what that is?&nbsp; Then definitely take a look.&nbsp; Our hope is that this will help others  recognize and appreciate the true beauty and splendor of ARKs. &nbsp;The new  page has already been re-purposed in a German &quot;technology watch&quot;  newsletter, &lt;<a href="http://www.kim-forum.org/techwatch/kim-dini-technology-watch-report1_2007.pdf" title="http://www.kim-forum.org/techwatch/kim-dini-technology-watch-report1_2007.pdf">http://www.kim-forum.org/techwatch/kim-dini-technology-watch-report1_2007.pdf</a>&gt;  which is the very first edition of a bi-annual publication from the  Interoperable Metadata Center for Excellence and the German Networked  Information Initiative.</p>
<p><strong>Tidal  wave of web data knocking on our door</strong><br />
 For the past several years the Digital Preservation group has been working with Andreas Paepcke and Hector Garcia-Molina at Stanford University on web crawling activities.&nbsp; Their research group has a wealth of experience collecting web data, and while CDL&rsquo;s Digital Preservation group was getting its &ldquo;web crawling sea legs&rdquo; the group asked Stanford&rsquo;s team to collect data on our behalf.&nbsp; Over the years Stanford has collected over 100 TB of data, covering .gov sites, election data, Hurricane Katrina, the Virginia Tech tragedy, and more.&nbsp; However, they have been using a different crawler from Heritrix, the crawler behind the Web Archiving Service (WAS).&nbsp; As a consequence their crawler output is incompatible with most web archiving services, including ours.&nbsp; The good news is that they have recently created a tool that converts their crawler output into something that CDL&#8217;s service can understand.&nbsp; Erik Hetzner, Mike Wooldridge, and Scott Fisher are just beginning to play around with this, but we are hoping for a positive outcome.</p>
<p><strong>Contributing  to the community by documenting Heritrix</strong><br />
 As  mentioned above, our Web Archiving Service uses Heritrix, the Internet  Archive&#8217;s (IA) open-source, extensible, web-scale, archival-quality web crawler  project.&nbsp; &quot;Heritrix&quot; (often misspelled heretrix, heratrix,  heritix, etc.) is an archaic word for &quot;heiress&quot;, which the IA chose because the project seeks to collect and preserve the digital artifacts of our  culture for the benefit of future researchers and generations.&nbsp; One of the  challenges of using Heritrix is that there is a dearth of documentation.&nbsp; Over the next several months Hunter Stern, CDL&#8217;s technical writer, will be working with Heritrix programmers at CDL and IA to better document the crawler.&nbsp; This collaboration will help us tremendously and benefit the  crawler community as well.</p>
<p><strong>Moving  big data: Mass Transit Project</strong><br />
Over the past couple of years the Digital Preservation Group has been working with the campuses to move large chunks of content into the Digital Preservation Repository (DPR).&nbsp; In the process we have encountered a few speed bumps. The issues are two-fold but related: the files are large and the network transfer rates have been unaccountably slow.&nbsp; Though we have worked towards resolving this, we have more work to do in understanding the best transfer tools and in monitoring our networks to make sure there are no log jams and that they can be used to their full potential bandwidth.&nbsp; The goal is to make the best use of our Internet2 pathways between the campuses and the data centers for the benefit of all CDL projects.</p>
<p>The Digital Preservation group has embarked on two efforts to speed up movement of large files into the DPR.&nbsp; First, they are collaborating with the San Diego Supercomputer Center (SDSC) to understand how to transfer data across the network more quickly and efficiently.&nbsp; Second, they are implementing (on a trial basis) a method of pulling large numbers of external data objects into a kind of preservation holding tank in order to reduce the impact of network speed and latency on the overall DPR ingest process.&nbsp; They are very excited about the collaboration with SDSC, and Kirsten Neilsen will be leading the project for CDL. We&rsquo;re calling the project &ldquo;Mass Transit&rdquo; and there is a project wiki &lt;http://masstransit.sdsc.edu/&gt;.</p>
<p>If you want  any additional information on any of these projects please contact Trisha Cruse  (patricia.cruse@ucop.edu). </p>
]]></content:encoded>
			<wfw:commentRss>http://cdlinfo.cdlib.org/blog/2007/10/17/digital-preservation-news/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CDL Guidelines for Digital Objects, Version 2.0:  Updated requirements for METS unique identifiers</title>
		<link>http://cdlinfo.cdlib.org/blog/2007/07/17/cdl-guidelines-for-digital-objects-version-20-updated-requirements-for-mets-unique-identifiers/</link>
		<comments>http://cdlinfo.cdlib.org/blog/2007/07/17/cdl-guidelines-for-digital-objects-version-20-updated-requirements-for-mets-unique-identifiers/#comments</comments>
		<pubDate>Tue, 17 Jul 2007 16:46:17 +0000</pubDate>
		<dc:creator>raw</dc:creator>
		
		<category>Digital Preservation</category>

		<category>Technology</category>

		<category>Digital Special Collections</category>

		<guid isPermaLink="false">http://cdlinfo.cdlib.org/blog/2007/07/17/cdl-guidelines-for-digital-objects-version-20-updated-requirements-for-mets-unique-identifiers/</guid>
		<description><![CDATA[<p>The &#34;CDL Guidelines for Digital Objects, Version 2.0&#34; (CDL GDO) has been updated to reflect modified requirements for METS unique identifiers.&#160; </p>]]></description>
			<content:encoded><![CDATA[<p>By Adrian Turner, CDL Data Acquisitions consultant</p>
<p>The &quot;CDL Guidelines for Digital Objects, Version 2.0&quot; (CDL GDO) has been updated to reflect modified requirements for METS unique identifiers.&nbsp; You can find the updated version at <a href="http://www.cdlib.org/inside/diglib/guidelines/">http://www.cdlib.org/inside/diglib/guidelines/</a> .</p>
<p>The revision applies to Section 3.1 only, and pertains to objects submitted for the CDL&#8217;s &quot;Enhanced Service Level&quot;.&nbsp; This service level encompasses the presentation of digital assets via CDL websites. It is also sufficient for increased preservation services in the UC Libraries Digital Preservation Repository.</p>
<p>The METS top-level &lt;mets&gt; tag must contain an OBJID attribute containing an ARK identifier for the digital object.&nbsp; Previously, the CDL GDO indicated that the OBJID attribute could contain a unique local identifier in lieu of an ARK identifier.&nbsp; However, CDL systems do not support that scenario for objects submitted for the Enhanced Service Level; the change applies to that service level only.</p>
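<p>The OBJID requirement can be sketched as a simple pre-submission check in Python. The pattern below is a deliberately loose assumption (the label "ark:/", a numeric name assigning authority number, a slash, and a non-empty name), not the full ARK syntax, and the function name and sample identifiers are hypothetical. The METS namespace is omitted for brevity.</p>

```python
import re
import xml.etree.ElementTree as ET

# Loose, assumed pattern: "ark:/", a numeric authority number, "/", a name.
ARK_PATTERN = re.compile(r"ark:/\d+/\S+")


def has_ark_objid(mets_root):
    """Return True when the root element's OBJID attribute looks like an ARK."""
    objid = mets_root.get("OBJID", "")
    return ARK_PATTERN.fullmatch(objid) is not None


# Both identifiers below are made up for illustration.
with_ark = ET.Element("mets", {"OBJID": "ark:/13030/hypothetical-name"})
with_local_id = ET.Element("mets", {"OBJID": "local-0001"})

print(has_ark_objid(with_ark))       # True
print(has_ark_objid(with_local_id))  # False
```

<p>A check like this only confirms the shape of the identifier; it does not confirm that the ARK has actually been assigned to the object.</p>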
<p>Please contact the CDL at <a href="http://www.cdlib.org/inside/feedback/">http://www.cdlib.org/inside/feedback/</a> if you have any questions.</p>
]]></content:encoded>
			<wfw:commentRss>http://cdlinfo.cdlib.org/blog/2007/07/17/cdl-guidelines-for-digital-objects-version-20-updated-requirements-for-mets-unique-identifiers/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Digital Preservation Program Update</title>
		<link>http://cdlinfo.cdlib.org/blog/2007/05/23/digital-preservation-program-update/</link>
		<comments>http://cdlinfo.cdlib.org/blog/2007/05/23/digital-preservation-program-update/#comments</comments>
		<pubDate>Wed, 23 May 2007 18:08:24 +0000</pubDate>
		<dc:creator>raw</dc:creator>
		
		<category>Digital Preservation</category>

		<guid isPermaLink="false">http://cdlinfo.cdlib.org/blog/2007/05/23/digital-preservation-program-update/</guid>
		<description><![CDATA[<p>This article provides a status report on the various projects of the Digital Preservation Program including the current status of the Digital Preservation Repository (DPR), information on the next release of the Web Archiving Service (WAS) and on NOID (Nice Opaque Identifier) software.</p>]]></description>
			<content:encoded><![CDATA[<p>By Kirsten Neilsen, Digital Preservation Service Manager</p>
<p><strong>Digital Preservation Repository (DPR)</strong><br />
  The Digital Preservation Repository (DPR) provides the UC Libraries with a shared solution for the preservation, management, and controlled dissemination of digital collections.</p>
<p>To date, UC Libraries have successfully moved about 250 GB &ndash; more than 55,000 objects &ndash; into the production DPR environment, with several projects on deck. Thus far objects ingested have been predominantly image and text files, but DPR can ingest video and audio files as well.</p>
<p>With core ingest, storage, and management functionality in production, the Digital Preservation Group is developing additional preservation services, such as remote data replication, and enhancing reporting functionality. Research into data storage and data transfer, issues central to digital preservation, is ongoing.</p>
<p><strong>Web-at-Risk Update</strong><br />
  In collaboration with archivists and librarians from a number of UC (and other) libraries, the Web-at-Risk program is developing the Web Archiving Service, a set of tools for capturing and preserving at-risk materials from the web. Development of the service proceeds in a series of phased pilot tests. During each pilot release, the project&rsquo;s curators test functionality and suggest improvements. Feedback from curators is incorporated into the subsequent releases.</p>
<p>Development of the Web Archiving Service (WAS) is progressing toward the 4th of 7 releases, scheduled for July. The upcoming release includes collection building features that allow curators to selectively add captured web content to a thematic collection.&nbsp; The release will also include website change analysis tools to help curators identify files on a site that have changed or that are new. The Web-at-Risk curators, a group of approximately 30 UC, Stanford, NYU and University of North Texas government information specialists, will be meeting in Oakland at the end of May.&nbsp; A smaller group of curators will be taking part in usability testing sessions on the new WAS interface.&nbsp; </p>
<p>The most recent WAS release took place in January and included the ability to better analyze capture results and to explore results by file type.&nbsp; The analysis of that pilot test is complete, and is posted on the Web-at-Risk wiki: <a href="http://wiki.cdlib.org/WebAtRisk/">http://wiki.cdlib.org/WebAtRisk/</a> .</p>
<p>The Web-at-Risk program recently received additional funding from the National Digital Information Infrastructure and Preservation Program to explore end user access to web archives.</p>
<p><strong>NOID (Nice Opaque Identifier): Minter and Name Resolver</strong><br />
  A new Inside CDL page (<a href="http://www.cdlib.org/inside/diglib/noid/">http://www.cdlib.org/inside/diglib/noid/</a>) contains a brief discussion of opaque identifiers, persistence, and name resolution as a way of introducing NOID, software created at CDL to provide part of the solution to the problem of persistent identifiers.<br />
  &nbsp; <br />
  For information, contact: Kirsten Neilsen, Digital Preservation Service Manager<br />
  510-987-0456 <a href="mailto:kneilsen@ucop.edu">kneilsen@ucop.edu</a> </p>
]]></content:encoded>
			<wfw:commentRss>http://cdlinfo.cdlib.org/blog/2007/05/23/digital-preservation-program-update/feed/</wfw:commentRss>
		</item>
		<item>
		<title>CDL Guidelines for Digital Objects, Version 2.0 — available online</title>
		<link>http://cdlinfo.cdlib.org/blog/2007/02/01/cdl-guidelines-for-digital-objects-version-20-%e2%80%94-available-online/</link>
		<comments>http://cdlinfo.cdlib.org/blog/2007/02/01/cdl-guidelines-for-digital-objects-version-20-%e2%80%94-available-online/#comments</comments>
		<pubDate>Thu, 01 Feb 2007 20:00:52 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category>General</category>

		<category>Digital Preservation</category>

		<category>Technology</category>

		<guid isPermaLink="false">http://cdlinfo.cdlib.org/blog/2007/02/01/cdl-guidelines-for-digital-objects-version-20-%e2%80%94-available-online/</guid>
		<description><![CDATA[The CDL and Digital Library Services Advisory Group (DLSAG) are pleased to announce the release of the final version of the CDL Guidelines for Digital Objects (CDL GDO), Version 2.0. The guidelines are available in HTML and PDF format at the following URL:
http://www.cdlib.org/inside/diglib/guidelines/
Digital materials of ever-increasing variety and complexity are seen to be worth collecting [...]]]></description>
			<content:encoded><![CDATA[<p>The CDL and Digital Library Services Advisory Group (DLSAG) are pleased to announce the release of the final version of the CDL Guidelines for Digital Objects (CDL GDO), Version 2.0. The guidelines are available in HTML and PDF format at the following URL:</p>
<p><a href="http://www.cdlib.org/inside/diglib/guidelines/">http://www.cdlib.org/inside/diglib/guidelines/</a></p>
<p>Digital materials of ever-increasing variety and complexity are seen to be worth collecting and preserving by memory organizations — libraries, archives, museums, etc. Materials include objects converted into digital form from existing collections such as manuscripts, maps, visual images, and sound files, as well as &#8220;born digital&#8221; materials such as web sites.</p>
<p>In order for the CDL to provide effective preservation and access services, these materials need to be represented in a uniform manner.  The CDL GDO provides specifications for all new digital objects prepared by institutions for submission to the CDL. It is based upon and supersedes the &#8220;CDL Digital Object Standard, Version 1.0&#8221; (May 2001) and the &#8220;OAC Best Practice Guidelines for Digital Objects, Version 1.1&#8221; (January 2004).</p>
<p>The CDL GDO includes the following features:</p>
<ul>
<li>Establishes &#8220;sliding scale&#8221; requirements, i.e., the more a digital object conforms to the guidelines, the more preservation and access services can be provided for it.</li>
<li>Provides specifications for preparing digital objects, comprising metadata and content files (e.g., digital images, text) packaged using the Metadata Encoding and Transmission Standard (METS) format.</li>
<li>Includes updated recommendations for digital image files.</li>
</ul>
<p>A draft version of the guidelines was prepared from the fall of 2004 through the winter of 2005. Feedback received from CDL contributing institutions was incorporated into this final version of the guidelines.</p>
]]></content:encoded>
			<wfw:commentRss>http://cdlinfo.cdlib.org/blog/2007/02/01/cdl-guidelines-for-digital-objects-version-20-%e2%80%94-available-online/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
