This document is in the Public Domain.
OKFN Conference
tag = OKCON
or paste the links here
- OKFN built demonstrator to show where [...] health, education, public, economic affairs, defence, public safety, social protection, etc.
- money tracking via data is only skin deep, in the billions, not able to easily track in the millions.
- The 3 S's = Search, Storage and Services
- these three are the infrastructure for our society
Speaker
Chris Taggart
Soundbites
They don't want to be transparent in practice
Local data
A mess
Sporadically published by central government
Inaccessible council websites
Opaque local bodies
Uncertain legal status
Start with basics
Who are the councillors
Where do they represent
etc
Background
Inspiration from Manchester project
MCC works for you
Screen scrapes council websites
Make it available as linked data
RDF
JSON
XML
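A minimal sketch of the scrape-then-republish idea noted above, using only the Python standard library. The HTML snippet, the class names and the field names are all hypothetical; every real council site would need its own scraper.

```python
import json
from html.parser import HTMLParser

# Hypothetical markup (assumption): real council pages all differ.
SAMPLE_HTML = """
<ul id="councillors">
  <li class="councillor"><span class="name">A. Example</span> <span class="ward">North Ward</span></li>
  <li class="councillor"><span class="name">B. Sample</span> <span class="ward">South Ward</span></li>
</ul>
"""


class CouncillorParser(HTMLParser):
    """Collect name/ward pairs from <span class="name"> and <span class="ward">."""

    def __init__(self):
        super().__init__()
        self.councillors = []
        self._field = None  # which field the current <span> holds, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "councillor":
            self.councillors.append({})  # start a new record
        elif tag == "span" and cls in ("name", "ward"):
            self._field = cls

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        # Only keep text that sits inside a recognised <span>.
        if self._field and self.councillors:
            self.councillors[-1][self._field] = data.strip()


def scrape(html):
    parser = CouncillorParser()
    parser.feed(html)
    return parser.councillors


councillors = scrape(SAMPLE_HTML)
print(json.dumps(councillors, indent=2))  # republish as JSON
```

The same list of records could be serialised to XML or RDF; JSON is shown only because it is the shortest.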
Where next
More
Data
Councils
Connections
Others
Elections
Police
etc
Some things can't be done programmatically
Need crowdsourcing
Visualizations
Why does this matter
Transparency
See and understand what's going on
Engagement
Remove barriers
Equality of access
Much of this data is already available
For a price
Efficiency
A better way
Open data we can all consume
Simple questions that are hard to answer
How does the budget of my council compare with others
What are the backgrounds of my councillors
What are the relationships between councillors and major providers
Is this displayed on the register of interests?
Exemplars
Transparency
Private Eye: Rotten Boroughs
East Riding of Yorkshire Council
Had Audit
Audit BURIED on website
Nothing on council website
Nothing on audit commission website
Now have a DUTY to engage
Problems
IDs
Data tied up in pdfs
more
Open election project
Tackling the open local data problem
succeed or fail forward
Fail forward?
Learning from mistakes
Publish election data with RDF semantic markup
Consumed by RDF readers
URIs as identifiers
REALLY important
Where next
Freedom of data act
Change how gvnmt uses it
Central
Local
Change relationship between gvnmt and citizen
Support new business model
Hyperlocal sites
suppliers
innovative
supportive
Develop the meme
Enabler
Blocker
Peter Murray-Rust - Open Data in Science
- Meta: "Power corrupts; Powerpoint corrupts absolutely (Tufte)"
- Why Open Data is essential
- Software as an agent of revolution (liberation software)
- "linked data without openness is crippled"
- Have put in a JISC bid to open bibliographic data
- Climate change example - lack of open data and open software
- "as we are going under [due to climate change], at least we didn't violate copyright!"
- copyright should not be used for political control
- PMR is blaming agents of control: academic research librarians, non-profit and for-profit publishers - "agents of control" for scientific information
- Chemistry - "most reactionary of the physical sciences"
- software is controlled by commercial interests
- American Chemical Society lobbied National Institutes of Health to shut down PubChem
- datasets linked in the world from W3C
- biology datasets not formally open - can't reuse them without offending some restriction
- might just be they haven't said they can
- collects together crystallography data - 130,000+ structures
- Greasemonkey plugin for Inorg Chem journal [presumably to link together keywords in J articles to entries -Tom]
- "capture the latest crystallography from the web and to republish it on the web [with semantics]"
- Acta Crystallographica - partner
- open source software from Blue Obelisk to render 3D and 2D chem structures
- launched about a month or so ago
- how do we make this problem solvable?
- show scientists legal contracts and they turn off - want to get back to their labs
- named after pub in Cambridge
- [heh, I didn't know we had a CZ article on that. Need more pubs. -Tom]
- make explicit and robust statement of your wishes
- BMC journals endorse open data - Chemistry Central, J. Cheminformatics
- link to it from your site
- Inspired by whatdotheyknow
- ask questions of any data provider for their open provider
- changing community norms - you are expected to publish in science
- CRU e-mail hack - a lot of the data was closed due to agreements between research groups. Select Committee said all data should be open, but the research funding bodies make it closed! What leverage can we get on funding bodies to change this?
- funding bodies aren't the problem. funding bodies in UK say data should be open.
- condition of the grant that you publish your data.
- relatively few funding bodies are against openness.
- problem will disappear over the next few years.
Sören Auer (Universität Leipzig, Research Group AKSW) - Linked Open Data - a technology facilitating Open Knowledge
Sits on council of OKFN, co-founder of DBpedia
technical stuff about linked data.
- tool ecosystem - DXX, SILK, SemMF, poolparty, ontowiki, sigma, ORE, dl-learner, sindice, monetdb, virtuoso, WiQA
- create->interlink->fuse->classify->enrich->repair->create->...
- popular datatypes mapped out, but specialised ones aren't. = long tail of information domains
- RDF 101: entities, triples, content-negotiation
- State of the Linked Data Nation
- achievements: data commons, community, industrial uptake, THE vision
- "Linked data is the data layer for open knowledge on the web"
- Ben O'Steen: semantic pingback - what about spam? how are you going to combat spam triples?
- compare pingback and trackback - with pingback, you go and check whether it actually is linking back.
- Ben O'Steen: but it is difficult! you need trusted networks
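A rough sketch of the "go and check whether it actually is linking back" step. Everything here is an assumption made for illustration: the function name, the example URIs, and the simplification that the source document's triples are already fetched and parsed into tuples.

```python
# Triples the (hypothetical) source document actually publishes.
published = [
    ("http://example.org/alice",
     "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob"),
]


def verify_ping(published_triples, source, target):
    """Accept a ping only if the source really states a link to the target.

    Simplification (assumption): a link is any triple with the source as
    subject and the target as object.
    """
    return any(s == source and o == target for s, _p, o in published_triples)


genuine = verify_ping(published, "http://example.org/alice", "http://example.org/bob")
spam = verify_ping(published, "http://example.org/spammer", "http://example.org/bob")
print(genuine, spam)
```

This only filters out pings with no link behind them. It does not address the harder point raised in the discussion: a spammer can publish a real link, so some notion of trusted sources is still needed.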
Linked Open Data: from Ant Beck's mindmap:
Speaker
Sören Auer
Leipzig
Soundbites
Vision of Linked open data
ecosystem of data
Heterogeneous
Linked
Allows
Enriching
Repairing
Classifying
Why linked data
Important for communities
Science
Citizens
If web pages are generated from structured database
Then link the structured databases
Goal to become more decentralised and focus on data
Problems
Support for complex objects is hard
Support for popular objects is easy
Pictures
Music
What is linked data
Uses RDF data as a graph
Can be serialised as triples
Subject
OKFN
Predicate
Organises
Object
OKcon2012
Computers negotiate between triple stores
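The subject/predicate/object example above can be sketched as a plain tuple and written out in N-Triples syntax. The `http://example.org/` namespace is a placeholder assumption; the talk did not give URIs.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate object")

EX = "http://example.org/"  # placeholder namespace (assumption)

triples = [
    # "OKFN organises OKcon2012", as in the notes above
    Triple(EX + "OKFN", EX + "organises", EX + "OKcon2012"),
]


def to_ntriples(triples):
    """Serialise URI-only triples, one N-Triples statement per line."""
    return "\n".join(
        f"<{t.subject}> <{t.predicate}> <{t.object}> ." for t in triples
    )


ntriples = to_ntriples(triples)
print(ntriples)
# <http://example.org/OKFN> <http://example.org/organises> <http://example.org/OKcon2012> .
```

Literal values (strings, dates) need quoting and escaping that this sketch leaves out.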
State of the linked Data Nation
30 billion triple statements
Achievements
Extension thanks to data commons
Vibrant global community
Industrial uptake
Emerging gvnmnt adoption
UK is leader
Establishing Linked data as the VISION
Challenges
Coherence
Quality
Performance
Against relational
Data consumption
Large scale processing
Schema mapping
Data fusion
Usability
Update and notify after link established
Downward compatible with PingBack
On blogs
Linked data a technical layer for open knowledge
open, standards based
[^^ who wants to diff/merge this? we need distributed version control for etherpad! ;) -tom]
State of the Open Data Nation (w/ a slight legal perspective)
Jordan S. Hatcher, www.jordanhatcher.com
Director, Open Knowledge Foundation
- opendatacommons.org licenses - ODbL license, like CC or FSF GPL for data
- being used by OpenStreetMap - CC BY-SA transition to ODbL
- We need to think about data privacy: gaydar, Netflix,
- How to become a project w/ OKFN? <- a form to fill out and informally reviewed by OKFN board?
- Various licenses for software (OSS), content (CC) and data (ODbL)
Helen Turvey, Shuttleworth Foundation
- Fellowship programme for dynamic leader to champion your work
- org takes on the legal, financial and administrative burden
- equity stake is open, transparent and exposed
Glyn Moody
Author of 'Rebel Code'
Board member of OKFN
- nature of "want to share"
- SHARING IS UNDER ATTACK - "war on sharing"
- Anti-Counterfeiting Trade Agreement (ACTA)
- "battle between the old analogue world and the digital"
- We must be active in shaping the laws
- digital data invented by nature -DNA
- [Dawkins has that great analogy about trees with floppy disks hanging off each branch]
- The fundamental shift where the cost of information will be absolute zero.
- The real threat is *not* piracy, it is OBSCURITY
- NIN example of selling exclusive physical versions of songs
- Jill Sobule donation model <- you can pay to sing on her CD!
- The world of digital abundance is inevitable.
- The only way to stop it is to stop people sharing.
- The interesting question is how to make money from the free stuff.
- Repository Manager asking about how to archive research?
LUNCH
Ideas and Culture Session chaired by Bill Thompson
Data Spheres, Adnan Hadzi Goldsmiths: University of London/Deptford.TV
- Vision of open media broadcasting and open content remixing.
- Cinelerra video editing and server
- film: voices of the voiceless film
- screening on the 'mindsweeper pirate vote' <- burnt down :(
- carrying a receiver to pick up CCTV videos
- Letters create the 19th century social and cultural graphs
- we can get an insight into: who the author was writing to; who the author was writing about; etc.
- Transitioning from PHP → Python
- HTML, XML and RDF representations of letters
- beginning to visualise letters as data
- FOAF + DC + ??? <- new namespaces, e.g. text schema ontology?
- looking for more effort in project?
- how might you use this? / easier for scholars to find where patterns might emerge in letter writing, e.g. to/from, return all letters to A from B mentioning Z.
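A small sketch of the kind of query mentioned above ("return all letters to A from B mentioning Z"), treating letters as data. The `Letter` record, its field names and the sample letters are hypothetical and are not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Letter:
    sender: str
    recipient: str
    mentions: set = field(default_factory=set)


# Hypothetical sample data (assumption).
letters = [
    Letter("B", "A", {"Z", "Y"}),
    Letter("B", "A", {"Y"}),
    Letter("C", "A", {"Z"}),
]


def find_letters(letters, sender=None, recipient=None, mentioning=None):
    """Filter letters; an argument left as None matches anything."""
    return [
        letter for letter in letters
        if (sender is None or letter.sender == sender)
        and (recipient is None or letter.recipient == recipient)
        and (mentioning is None or mentioning in letter.mentions)
    ]


matches = find_letters(letters, sender="B", recipient="A", mentioning="Z")
print(matches)
```

With the letters published as RDF, the same question would more likely be asked as a SPARQL query; the filter above just shows the shape of it.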
- James Harriman-Smith - The Marriage of Text and Technology: An introduction to Open Shakespeare
- tools for searching and analysing shakespeare at first
- moving towards annotation tools and translation tools
- We don't have original texts, so there are differences.
- "oh that this too too solid/sallied/sullied flesh would melt"
- Looking to integrate with other criticism for annotations
- wordhoard <- license doesn't allow it.
- vandalism or disinformation is still a problem.
- re printing shakespeare editions with annotation <- tangilisation!
- annotation of texts that do not have sources? / we need to have new models for annotation.
- linking this with other literary texts? / yes, intertextual analysis across a canon of texts?
- cambridge faculty, extend to school teachers, etc.
- Ben O'Steen - Making the Physical from the Digital
- Background: Dealing with libraries
- job is dealing with Word :(
- Your interaction with the format and medium *matters*
- do we need faithful representations?
- What about real things, can they seamlessly be moved back and forth between physical and digital?
- 'makers' book is open and reformatable
- book reformatted as a till receipt
- making your own books that were born digital, e.g. blogs <-- how do you print them?
- divide up pages and separate into page formats
Open Bibliography session: chaired by Jonathan Gray
The Itinerant Poetry Library, Sara Wingate Gray
- The Library as a growing organism - Ranganathan's 5 Laws of Library Science
- need to recognise user community much more, library patrons as co-curators, the future of libraries is participative, flexible, open-ended, community and sharing based. Principles of public domain need to be applied and advocated for much more in this arena or a digital land-grab will ensue which will make us all losers in the long term.
Journal Commons: Open Process Academic Publishing in Practice, Toni Prug & Juan Grigera, School of Business & Management, Queen Mary: University of London
OpenCitations.net - publishing bibliographic citations as Open Linked Data using CiTO (Citation Typing Ontology), David Shotton, University of Oxford
- Adding semantics to a PLoS XML paper.
- PLoS does not actually link to cited papers, just provide a search tool to assist -- too many clicks!
- Speaker hopes PLoS will use it or similar to semantically indicate citation relationships
- CiTO includes *why* the item was cited
- or <bar> isCitedBy <foo>
- Setting up a triple store to describe citations
- data from UK PubMed, PLoS biomed, CrossRef & Soton EPrints
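A sketch of what "a triple store to describe citations" might look like at its very simplest, with the reason for each citation carried by the CiTO property used. The `CitationStore` class and the paper URIs are hypothetical; `cito:extends` and `cito:usesMethodIn` are properties in the CiTO namespace `http://purl.org/spar/cito/`.

```python
CITO = "http://purl.org/spar/cito/"


class CitationStore:
    """Tiny in-memory stand-in for a triple store (illustration only)."""

    def __init__(self):
        self.triples = set()

    def cite(self, citing, cited, reason="cites"):
        # The predicate records *why* the item was cited.
        self.triples.add((citing, CITO + reason, cited))

    def cited_by(self, paper):
        """Inverse view: (citing paper, reason) pairs for one cited paper."""
        return sorted(
            (s, p[len(CITO):]) for s, p, o in self.triples if o == paper
        )


store = CitationStore()
# Hypothetical paper URIs (assumption).
store.cite("http://example.org/paper/foo", "http://example.org/paper/bar", "extends")
store.cite("http://example.org/paper/baz", "http://example.org/paper/bar", "usesMethodIn")

print(store.cited_by("http://example.org/paper/bar"))
```

Here the inverse direction is computed on lookup rather than stored as separate isCitedBy triples; a real store could do either.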
Using the Institutional Repository to publish research data, Christopher Gutteridge, University of Southampton/EPrints Project
Open Data and the Semantic Web
Utilizing, creating and publishing Linked Open Data with the Thesaurus Management Tool 'Pool Party', Thomas Schandl punkt.net Services
Speaker
Thomas Schandl
punkt.net Services
Soundbites
My view
Quite intuitive GUI
Poolparty overview
SKOS thesaurus management system
Use case
Integrate data
Semantic search
Tag recommendation
Autocomplete
Ease of use
For non sem web experts
Linked open data
Consumes
Produces
Plugs into enterprise architecture
Using REST
SKOS overview using poolparty
Uses concepts instead of keywords
Graph visualization
Add value through use of LinkedData and DBpedia
Term enhancement
You can browse the links using PoolParty
Towards a Korean DBpedia and an Approach for Complementing the Korean Wikipedia based on DBpedia, Eun-kyung Kim, Matthias Weidl, Key-Sun Choi, Sören Auer, KAIST & Universität Leipzig
Speaker
Eun-kyung Kim
Matthias Weidl
Key-Sun Choi
Soren Auer
KAIST
Universität Leipzig
Overview
Technical overview of the implementation of a Korean version of DBpedia
Interesting to see the difference in implementation
Heavily image driven for the technical detail
Get the slideshare version
Structure in Wikipedia
Title
Abstract
Wikipedia Infoboxes
Are like triples
Geo co-ordinates
Categories
Images
Links
Other language versions
Other pages
Etc
How does it work
Infobox
Reasonably simple
Slide explains this
TripleStore
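A very reduced sketch of why "infoboxes are like triples": each `| key = value` line becomes (page, key, value). This is not the DBpedia extraction framework, which also has to deal with nested templates, links, units and language-specific infobox names; the sample wikitext is made up.

```python
def infobox_to_triples(page_title, wikitext):
    """Turn '| key = value' infobox lines into (page, key, value) triples."""
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue  # skip the template header, closing braces, etc.
        key, value = line[1:].split("=", 1)
        key, value = key.strip(), value.strip()
        if key and value:  # ignore empty fields
            triples.append((page_title, key, value))
    return triples


# Made-up sample infobox (assumption).
SAMPLE = """{{Infobox settlement
| name = Leipzig
| country = Germany
| population_total =
}}"""

infobox_triples = infobox_to_triples("Leipzig", SAMPLE)
print(infobox_triples)
```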
Open Government Data on a Large Scale: The Challenges, Jeni Tennison (TBC)
Raw data... now what
Open data
Free info
PDF
No
Raw data
Excel
No
Getting people to publish information because they want to
Bottom up
Motivate sustainable publication
Dumps and Slices
Programmatic access
Responsible Publishing
Context is everything
Link to
Caveats
Code description
DETAILS
Provenance and versioning
Exemplars
Patterns
Guidance
Providing Identifiers
Persistent URIs
RDF
Allows the easy merging of data
Supports distributed publishing
Above is why RDF is more appropriate than
csv
json
xml
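A sketch of the "easy merging" point: when two publishers use the same URI for the same thing, combining their data is just a set union of statements, with no schema alignment step. The URIs and values below are invented for illustration.

```python
# Two publishers describe the same councillor (hypothetical URIs and values).
council_a = {
    ("http://example.org/id/councillor/1",
     "http://xmlns.com/foaf/0.1/name", "A. Example"),
}
council_b = {
    ("http://example.org/id/councillor/1",
     "http://example.org/def/ward", "North Ward"),
}

# Merging distributed data is a set union of triples.
merged = council_a | council_b


def describe(graph, subject):
    """Collect everything the merged graph says about one URI."""
    return {p: o for s, p, o in graph if s == subject}


description = describe(merged, "http://example.org/id/councillor/1")
print(description)
```

Doing the same with two CSV files or two JSON documents would first need agreement on column names or nesting, which is the contrast drawn in the notes.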
Flexible consumption
Generated configurable API over a SPARQL endpoint
Means you don't have to learn SPARQL
List and search results
Search through URI parameters
Configurable API over linked data
Approachable formats
Generates simple
JSON
XML
Configurable for other formats
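One way to picture "search through URI parameters" so that users need not learn SPARQL: a thin layer that rewrites `prop=value` parameters into a query. This is a guess at the general shape, not the API described in the talk; `BASE`, the parameter names and the function name are all assumptions.

```python
import re
from urllib.parse import parse_qsl

BASE = "http://example.org/def/"  # hypothetical vocabulary namespace


def params_to_sparql(query_string, limit=10):
    """Turn 'prop=value' URL parameters into a simple SPARQL SELECT."""
    patterns = []
    for prop, value in parse_qsl(query_string):
        # Only allow plain names, so a parameter cannot inject SPARQL.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", prop):
            raise ValueError(f"unsupported parameter name: {prop!r}")
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        patterns.append(f'  ?item <{BASE}{prop}> "{escaped}" .')
    body = "\n".join(patterns)
    return f"SELECT ?item WHERE {{\n{body}\n}} LIMIT {int(limit)}"


query = params_to_sparql("ward=North+Ward&party=Independent")
print(query)
```

The results of such a query would then be rendered as simple JSON or XML for the caller.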
Challenges
Centralised versus distributed
Assignment of URIs
Publication of data
Rules versus guidance
Help without taking control
Trusting in the web
Open Data and Local Councils, Stuart Harrison, Lichfield District Council
Speaker
Stuart Harrison
Lichfield District Council
Lichfield district
North of B'Ham
Two urban centres
Mainly older population
Younger peaks in urban centres
Exemplars
RateMyPlace
First Lichfield site
2005/6
Food safety scores
Built completely in house
Commercial expensive
Opened the data
Restful API
Returns data
XML
JSON
Postcode
Details
Results
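A sketch of a postcode lookup that returns the same results as JSON or XML, in the spirit of the API described above. It is not the RateMyPlace API: the data, postcode, element names and function name are made up.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical food safety scores keyed by postcode (assumption).
RATINGS = {
    "ZZ1 1ZZ": [{"name": "Example Cafe", "score": 4}],
}


def lookup(postcode, fmt="json"):
    """Return the ratings for a postcode, serialised as JSON or XML."""
    key = postcode.upper().strip()
    results = RATINGS.get(key, [])
    if fmt == "json":
        return json.dumps({"postcode": key, "results": results})
    if fmt == "xml":
        root = ET.Element("results", postcode=key)
        for r in results:
            place = ET.SubElement(root, "place", score=str(r["score"]))
            place.text = r["name"]
        return ET.tostring(root, encoding="unicode")
    raise ValueError(f"unsupported format: {fmt}")


as_json = lookup("zz1 1zz", "json")
as_xml = lookup("zz1 1zz", "xml")
print(as_json)
print(as_xml)
```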
Built JavaScript Widgets
Exposed more council data
Including Geo Data
OS issues
What are the barriers
Lack of awareness
Personal issues
Fear
uncertainty
doubt
Tools
Open Elections Project
Aims to overcome some of these barriers
Standards based
RDFa
Minimal technical knowledge
Minimal cost
Spatial Data
OS derived data
Can't be re-used by third parties
Massive issues with google maps
Ongoing discussion with Cabinet Office
Why Open Data
Allows us to engage with a different audience
Makes engagement a many to many arrangement
Failure for free
It's going to happen anyway
COMMUNITY DRIVEN RESEARCH
Collaborative Structuring of Knowledge by Experts and the Public, Tom Morris (and Daniel Mietchen), Citizendium
- Like Wikipedia but with editors and citation names of people who write articles.
- goal of 100k by 2012; not realistic at present
- "the crank problem" <- fakers; from the paper: "hard to handle from an editorial perspective because those willing to invest their time on the topics are usually heavily biased in their approach, and most of those capable of evidence-based comment prefer not to contribute to these topics."
Dig the new breed: how open approaches can empower archaeologists, Anthony Beck
Talk
What is archaeology
Machining - Topsoil stripping
Cleaning - shovel scraping
Excavating
Interpreting
Recording
Data
Local
Regional
Landscape
Hermeneutic knowledge cycle
Data
Theory
Practice
Archaeology as human ecology
Relationship between
Flora
Fauna
Landscape
Climate
People
Objects
As expressed in the archaeological record
Problem
Siloed Data
Primary data
Excavation records
Remote sensing transcriptions
NMP
Lab Analysis
Specialist reports
Decoupled Synthetic Data
Site reports (grey literature)
SMRs
NMRs
Implications
No synergy
Cripples the knowledge frameworks
Less effective
Research
Policy
Impact
Open approaches
Open Data
Open Access
Open Science
Open Policies
Exemplars
Grey Literature
and grey data. the data is inside the 'grey' literature - unpublished literature that floats around between researchers
Archaeology units conduct most excavations in the UK
Problem
Predominantly paper based recording (still)
Primary record is difficult to access digitally
Excavations are written up as site reports (interpretative and data summary)
These reports are not published: hence Grey Literature
Solution
Prof. Richard Bradley – Reading Uni
Visited contract units
Collated ‘grey literature’
Transformed theory and interpretative frameworks about Bronze Age settlement patterns and dynamics
Transformed
Frameworks
Theory
Interpretation
Understanding
Bronze age
Settlement patterns
Dynamics
INSPIRE
INSPIRE (INfrastructure for SPatial InfoRmation in Europe)
EU Directive for a general framework describing Spatial Data Infrastructure (SDI).
Designed to facilitate European wide sharing of spatial information:
public sector organisations
public access
Improve decision making
Improve policy
Advocate a schema based implementation
INSPIRE applies to data held by public institutions
This includes heritage data. Nominally:
Decoupled synthesis:
Sites and Monuments Records
National Monuments Records
National Mapping Programme (national AP dataset)
Approaches
INSPIRE conceptual approach
Lossy
Employs decoupled, synthesised and generalised data
Does not change to re-interpretation of the underlying source data
Requires digital data
the syntheses exist
Dynamic - source data updates but doesn’t change
Linked Data conceptual approach
Flexible
Generic
Requires digital data
most primary data is not digital
Dynamic - can change to reflect the source data
Not an either/or approach. BOTH open up heritage data
Linked Dynamic Data
Archaeological knowledge acquisition is a dynamic process
Dynamic feedback allows theories/practice to be tested or revised
Pottery Sequences
Pottery is important for dating sites and deposits
Classification based on form and fabric variations
Dates derived from stratified sequences (e.g. wells)
Pottery sequences developed locally and integrated –
Regionally
Nationally
Pottery classification
Periodically sequences are reviewed
Clumping (owl:sameAs)
Splitting
Refining date ranges
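A sketch of how "clumping" could be computed once local, regional and national pottery types are linked by owl:sameAs statements: types connected by any chain of links fall into one group. Union-find is a technique swapped in here for illustration, and the type identifiers are invented.

```python
def clump(same_as_pairs):
    """Group identifiers connected by owl:sameAs links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(group) for group in groups.values())


# Invented identifiers for locally, regionally and nationally defined wares.
pairs = [
    ("local:ware-A", "regional:ware-1"),
    ("regional:ware-1", "national:ware-X"),
    ("local:ware-B", "regional:ware-2"),
]

clumps = clump(pairs)
print(clumps)
```

Splitting is the harder direction: it means withdrawing a sameAs statement and re-dating whatever depended on it, which is where linked, dynamic source data would matter.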
Date changes impact on:
Interpretation
Significance
Policy
Think “Grey literature” but bigger!
Unfortunately the data is decoupled and not linked. The primary and synthetic data is never/rarely re-interpreted
DART
What are the best ways to employ the different sensors (a multi-sensor approach) for the greatest heritage return?
In particular how do we improve the use of different sensors in regional/national prospection programmes?
What are the best conditions (e.g. environmental, seasonal, weather, crop) for deployment?
Why open science
25 heritage, industry and academic partners
The best way to keep everyone informed is to adopt an open science philosophy
Wherever practicable all data will be in the public domain as soon as possible in accessible repositories
If successful SHARE-ME (a companion JISC bid) will simplify metadata generation, RDF generation and deposition
RDF data will be maintained with Talis.
Will allow the science to be dynamically shared with colleagues throughout the world
Improve the scientific process
Improve results
Improve impact
More WOW moments
16.05 /end
Clear Climate Code – Our Role In Global Warming,
David Jones and Nick Barnes, Clear Climate Code - Ravenbrook Limited (Cambridge)
16.05pm start
- GisTemp software - global historical climate network
- temperature series per area/time, e.g. England temp provided from 16th century forward
- dataset is invaluable for making decisions given political climate
- can download dataset as well as per machine data
- people don't want to believe it
- 2007 code was published: FORTRAN 77 - 8000 lines of it!
- established clear climate code goals: produce clear climate science software, encourage production of the software, increase public confidence in climate science results
- retain all data -> refactor as Python ->
- Ergo, if your processing (alongside your data) is not published, then you can't prove it.
[Looked at the code - seems like the problem with the Fortran code is it isn't abstracted enough. The Python code seems to be a rewrite rather than a better abstraction. Basically, abstract away the lower level of the calculation. -Tom]
OpenStreetMap,
Emilie Laffray,
OpenStreetMap
start:
OpenStreetMap
Speaker
Emilie Laffray
OpenStreetMap
Soundbites
OSM
Wiki style mass collaboration
Large number of people
Small contributions
Vector System
Haiti earthquake
Pre imagery
Yahoo
Post Imagery
GeoEye post quake imagery within 24 hours
Process
Tagged destroyed buildings
Mapping
Putting mapping into HandHelds
Uses
Routing around obstacles
Timeline
48 hours after
reasonably detailed map
Produced best map of Haiti
Future
Creation of Humanitarian OSM Team (HOT)
Improving
Co-ordination
Tools
Particularly editing tools for novice users
TOOLS
Centralized and Distributed Revisioning of Data based on the CKAN experience, Rufus Pollock and John Bywater
Large scale data handling and revisioning: Experience from the Genome, Tim Hubbard, Wellcome Sanger Centre
Can ScraperWiki actually work?, Julian Todd and Aidan McGuire, Scraper Wiki
CIVIC INFORMATION
The Straight Choice, Richard Pope
What you say is who you are: How open government data facilitates profiling politicians, Maarten Marx and Arjan Nusselder, Informatics Institute, University of Amsterdam
Election Data, Francis Irving, Edmund von der Burg, mySociety, YourNextMP, Democracy Club
Where Does My Money Go?, Panel from Project Team
OPEN GOVERNMENT DATA AND PSI IN THE EU
Open Government Data in Norway, Olav Anders Øvrebø, University of Bergen
Open Government Data in Germany, Daniel Dietrich, OKF Germany/Open Data Network
Access or re-use of PSI? A cookie if you get it right!, Katleen Janssen, ICRI/K.U.Leuven
PSI and Open Government Data in Europe, Chris Corbin, ePSIplatform
Open Law and Democracy experiences in France (short talk), Benjamin Ooghe and Tangui Morlier, Regards Citoyens
OPEN DATA IN INTERNATIONAL DEVELOPMENT
Open Development Data, Panel discussion
WIKIMEDIA UK
Wikimedia and Education.... the road not taken?, Jan-Bart de Vreede, Wikimedia Foundation Trustee
Tropenmuseum and Wikimedia, Hay Kranen, Wikimedia Nederland
Wikimedia Nederland Library project, Jose Spierts, Wikimedia Nederland
Wikimedia and beyond: The future, Panel Discussion: Jan-Bart de Vreede, Joseph Seddon, Jose Spierts, Mathias Schindler (TBC)