<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../assets/xml/rss.xsl" media="all"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Design Meets Data (reproducibility)</title><link>http://www.designmeetsdata.com/</link><description></description><atom:link type="application/rss+xml" href="http://www.designmeetsdata.com/categories/reproducibility.xml" rel="self"></atom:link><language>en</language><lastBuildDate>Sun, 04 Oct 2015 06:15:39 GMT</lastBuildDate><generator>https://getnikola.com/</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>The Design Data List</title><link>http://www.designmeetsdata.com/posts/design-data-list.html</link><dc:creator>Mark Fuge</dc:creator><description>&lt;div&gt;&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;tl;dr:&lt;/strong&gt;
I've made &lt;a href="https://github.com/IDEALLab/design-data-list"&gt;a GitHub Repo&lt;/a&gt; for linking to known Design Data resources. Check it out and add to it if you or someone you know has design data somewhere.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;One theme to come out of &lt;a href="http://www.designmeetsdata.com/posts/idetc-2015-recap.html#dci"&gt;this year's IDETC Design Theory and Methodology session on Creativity and Ideation&lt;/a&gt; was the need for increased data sharing and reproducibility in our community. We do have the beginnings of such a culture, particularly in the CAD and Design Computing sub-fields. But cultural change takes time, and we have struggled to find a central place to share and store data, as the Machine Learning community has managed to do with the &lt;a href="http://archive.ics.uci.edu/ml/"&gt;UCI Machine Learning Repository&lt;/a&gt; and &lt;a href="http://mloss.org/"&gt;MLoss&lt;/a&gt;. Jami Shah's group has valiantly tried to create a central repository through the &lt;a href="http://asudesign.asu.edu/protocol_repository/repository"&gt;ASU Design Protocol Repository&lt;/a&gt;, and Rob Stone's group has provided &lt;a href="http://function2.mime.oregonstate.edu:8080/view/index.jsp"&gt;a product repository&lt;/a&gt; for several years, but the practice of reproducibility and data sharing is not yet widely adopted.&lt;/p&gt;
&lt;p&gt;I think this comes down to the perceived difference between benefits and costs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Benefits:&lt;/strong&gt; The benefits of sharing code and data aren't as well understood within our community as they are in other communities (Computer Science, Psychology, Economics, &lt;em&gt;etc.&lt;/em&gt;). Educating the community about them will take time, and is not the purpose of this post (though it would be a good topic for a future post).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Costs:&lt;/strong&gt; Sharing data or code is seen as arduous (which might or might not be true depending on how it is done). Even if one wanted to follow &lt;a href="http://opim.wharton.upenn.edu/~uws/"&gt;Uri Simonsohn's&lt;/a&gt; advice to &lt;a href="http://pss.sagepub.com/content/early/2013/08/23/0956797613480366.abstract"&gt;"Just Post It"&lt;/a&gt;, it's often unclear how or where to post your data/code to maximize impact and minimize work. Do you just post a zip file on your website, transfer it to &lt;a href="http://asudesign.asu.edu/protocol_repository/repository"&gt;a central community repository&lt;/a&gt;, or go all-in by publishing your entire workflow on something like the &lt;a href="https://osf.io/"&gt;Open Science Framework&lt;/a&gt;? This is what this post attempts to address.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;After chatting with folks at the past two IDETCs about this (notably &lt;a href="https://www.aburnap.com/"&gt;Alex Burnap&lt;/a&gt; and &lt;a href="https://engineering.purdue.edu/~ramani/"&gt;Karthik Ramani&lt;/a&gt;), I decided to address the Cost side of the equation by borrowing an idea I came across while trying to &lt;a href="https://github.com/josephmisiti/awesome-machine-learning"&gt;keep track of the ever-evolving landscape of Machine Learning libraries&lt;/a&gt;: rather than asking folks to pick some standard method of sharing their data, I created &lt;a href="https://github.com/IDEALLab/design-data-list"&gt;a GitHub Repo that just aggregates links to all of the design data sources that I know about&lt;/a&gt;. This works for me for several reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/IDEALLab/design-data-list/blob/master/CONTRIBUTING.md"&gt;Contributing is easy&lt;/a&gt;: it's just a set of text files. The original authors or any third party can add a link. It's completely open.&lt;/li&gt;
&lt;li&gt;Since it just provides links, researchers don't need to do anything fancy to share their data: they can choose to &lt;a href="http://pss.sagepub.com/content/early/2013/08/23/0956797613480366.abstract"&gt;just post it&lt;/a&gt; wherever is convenient, or they can use more formal mechanisms like &lt;a href="http://asudesign.asu.edu/protocol_repository/repository"&gt;central repositories&lt;/a&gt;. It's all treated equally; I (or anyone else) just need to add a link to it.&lt;/li&gt;
&lt;li&gt;It's a low-effort way to highlight some of the great open-source work we do in our community, and to visualize our collective efforts in one place. In NSF parlance: it helps increase our work's broader impact beyond just the papers we write.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.github.com"&gt;GitHub&lt;/a&gt;'s version tracking and commenting systems give us a lightweight but potentially useful means to have a discussion around the purpose and organization of the list. &lt;a href="https://help.github.com/articles/using-pull-requests/"&gt;Their pull request system&lt;/a&gt; means we can collaborate on it together and change it over time.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Will this be the permanent solution to bring reproducibility to the design community? I hope not. But it's a start: a way for folks to test the waters and establish a culture of reproducibility before trying their hand at something better suited, like the &lt;a href="https://osf.io/"&gt;OSF&lt;/a&gt;.&lt;/p&gt;&lt;/div&gt;</description><category>idetc</category><category>reproducibility</category><guid>http://www.designmeetsdata.com/posts/design-data-list.html</guid><pubDate>Sat, 08 Aug 2015 18:02:00 GMT</pubDate></item></channel></rss>