LTER Data Managers Workshop - 7/29/95

A. Formalization of committee structure - Briggs
   1. will be typing up copies
   2. changes since yesterday
      a) does not think it is possible to restrict based on different cohorts -- gets too complicated
      b) can do elections to Data Task via email
      c) terms run January to December
      d) formalization of standing committees
         1) Data Task will establish
         2) will exist as long as directed by Data Task
      e) expected that decisions and actions of standing committees will be reviewed by Data Task who will then notify all the DMs
      f) switch to 3 year terms to eliminate conflicting elections
      g) may need to have special elections if member has to leave
      h) chair of Data Task will run elections - votes tallied at NET
   3. discussion
      a) DMAN list - go to one contact at each site
         1) but need backup when away
         2) should be people who come to meetings
      b) need to review how things are working next year
      c) want proper balance between ability to act and group notification and consensus
      d) what are current standing committees?
         1) DMC - Data management committee
         2) Data Management Task Force - 6 members
            a> role is to respond quickly to requests from outside
            b> can establish other committees
         3) Data Management Meeting Organization Committee
         4) probable other committees - ad hoc - working groups
            a> Network Office Committee
               1> help with NET proposal
            b> Information System Committee
      e) full LTER committee structure is online on the WWW server
      f) in future - nominations should be done via email similar to CC

B. Metadata Standards - FLED committee, ESA - Kirchner
   1. FLED Committee
   2. had meeting at Jones Research Center to develop document
      a) to describe history on metadata issues
      b) set up outline for paper
   3. subsequent meeting in San Diego
      a) but don't know status from that
   4. discussion
      a) FGDC-based model at NBS for metadata
         1) John Helly will discuss
      b) was our work included in the document
         1) yes
      c) Metadata is one of several FLED subgroups
      d) Symposium Monday Morning will have FLED papers
      e) should we look to FLED for guidance?
         1) more summary of what is out there
         2) working out layered approach to metadata
         3) will incorporate lots from LTER and others
      f) should circulate FLED report for review by DM
      g) lots of other issues considered by FLED
         1) orphan data
         2) lots of others

C. Bibliographic Database - Chinn
   1. table may be out of date
      a) lots of upgrades
   2. should we move to distributed bibliography where each site maintains its own bibliography?
      a) stand alone WAIS servers
      b) SQL-queriable server
         1) would be at each site? - Yes
         2) RN- major technical challenge
            a> work on setting up is getting less and less
            b> increasingly moving to PCs - running on NT
            c> 5 years down the road expect things to be different - majority of sites will have SQL server.....
            d> software on desktop machines is much more varied so we need to use desktop clients
         3) BB- awfully high-end
      c) what about WWW tool?
      d) suggest maintaining centralized system and slowly pull in sites as they can set up their own
         1) centralized WAIS server
      e) updates
         1) is it awkward now?
         2) can download and update
   3. want a solution that is at least as powerful as the centralized
   4. advantages of decentralized
      a) updates
      b) will encourage DMs to improve site software
      c) testbed for distributed database
   5. disadvantages
      a) less even maintenance
   6. TK- Good goal, but there are specific things we do with our database that go beyond simple WAIS - would not want to step back
      a) will this be queriable over net?
      b) not substitute for local site bibliography
   7. JH- two opportunities here
      a) PRO-CITE may be interested in collaborating in developing improvements
         1) network accessibility
      b) we are not communicating with them well
      c) duplicate data AREN'T
         1) if you ever filter data you lose something
         2) time base rough and uneven - get different answers at different places
            a> need consistent -- hurts our credibility to have different answers
   8. SS- we are a little fish in the marketing world!
      a) need realistic expectations in the power we wield
      b) has been roadblock in past
   9. HC- if it is useful to us, may be useful to other biologists

D. Centralized list of online data - Nottrott
   1. datasets online have leapfrogged catalog
   2. many more online datasets than in core dataset catalog
   3. not consistent presentation of data and files at each site
   4. JP- Need to generate content standard for catalog entries
   5. what do individual site data catalogs look like?
   6. need to have consistent internal structure within own site
   7. could write some filters to take LTER online metadata and convert to catalog entry
   8. needs to be in information system working group

E. Outreach - Stafford
   1. lots of things listed in Site Bytes
   2. are there ways to be more effective
      a) need to target in priority order what groups we outreach to
   3. activities ongoing in global outreach
      a) China
      b) Organization for Tropical Studies
         1) several DMs have visited to look at DM
      c) Taiwan
   4. training materials would be helpful for non-LTER sites
      a) bringing LMERs up to speed
   5. responsibility in training people in managing data
   6. criteria for site data management is form of outreach
   7. activities with San Diego Supercomputer Center
      a) John Helly - went to Czech Republic
         1) spoke to about opportunities
      b) new executive director Peter Arzberger
         1) his vision is that for computational biology they should think about SDSC
   8. another book - form of outreach
   9. need to decide on what should be our focus - time (not $$) is limited
   10. internal outreach
      a) want to dispel "clublike" impression
      b) do need to communicate and help out other sites
      c) internal visibility issues
   11. suggest having working group look at outreach
      a) need to look at boundaries of how far we can be stretched
      b) international work requires substantial resources
   12. comments
      a) JB- Hungary
         1) very much starved for information
         2) interested in coming over for training
         3) would be really great to have training facilities
         4) what I did wrong: talked GIS and remote sensing
            a> got confused with regular data management
            b> need to keep focus on data management
         5) keep it as simple as possible
            a> focus on philosophy
      b) RI- Training for CERN
         1) they are REALLY into it - eager students
         2) preparation is useful to you as well as them!
      c) Gosz- initial stages of planning for visit to Spain and Morocco

F. Data Accessibility - Ingersoll
   1. table describing online data
   2. updating table requires scientific expertise
      a) Mike Hartman will be taking on...
         1) but would appreciate volunteer...
   3. available on LTERNET gopher and WWW
   4. Proposed Strategy for Maintenance of Online Datasets
      a) objectives
         1) current
            a> needs to be more up to date
         2) accurate
            a> can't be PROPOSED entries
         3) low-maintenance
            a> last round took three person days!
         4) simple (i.e., general)
   5. Strategy
      a) 1 person, but no more than 2 responsible for table
      b) 2 types of modifications
         1) site entries to pre-existing categories in table body
            a> approval by dman not required but gopher location or URLs MUST be supplied by the site info. manager so they can be verified by the table maintainer
            b> responsibility of site information managers
            c> information managers should make a routine part of protocols for making data available
         2) substantive modifications to body of table and/or metadata
            a> would include major reworking of table categories and/or major rewording of table metadata
            b> prototype of "new" table made available online at LTERNET in obscure location
               1> dman two weeks for suggested revisions
   6. comments
      a) relation to data catalog
         1) at higher level
         2) subject headings very helpful
         3) good start on keywords
         4) need online form for submissions
      b) issue on whether Gophers should be maintained
         1) all technology evolves
      c) table is very useful

G. Data Publication
   1. summary from last year's working group report
   2. need data journal - peer reviewed
      a) further action depends on individuals stepping forward and funding or staff
   3. anything else - FLED?
      a) some professional societies are addressing
         1) Henshaw has paper describing in J. Geophysical Research
      b) park service will be wiring scientists to
      c) good time for recommending ESA standards for data
      d) Accession Numbers
         1) no additional developments
      e) what about CDROM?
         1) several sites are planning on publishing some CDROMs
         2) bibliography may be on CDROM
         3) problem - can't correct errors
         4) advantage in publication medium
            a> forces discipline
            b> closure!
            c> press time is strong motivation
            d> some publishers publish updates

H. LTER Data Management Strategic Vision
   1. bullets followed by paragraphs
   2. statement of purpose
      a) promote ecological science by fostering synergy between ecological and information sciences
   3. bullets
      a) everything done in context of site/network/global levels
      b) info. management takes place in context of scientific research
         1) integral part of research platform
         2) glue that holds network together
      c) data needs to be transformed in a timely manner into information
         1) quality control
         2) information interface
         3) products
      d) availability in long-term time frame
      e) research on information management per se
         1) anticipate advances
         2) standards
         3) publications
      f) human resources
         1) training
         2) publications
      g) conclusion
         1) dynamic, working document
   4. next step
      a) small working group after this meeting to expand on bullets
   5. comments
      a) problem with bullets - order is VERY important
         1) give thought to ordering
         2) some discussion of exchanging first two bullets
      b) who is audience?
         1) would be very valuable for network office proposal
         2) internal to start, secondarily to external
         3) naive to think this will stay internal -- things get out RAPIDLY
         4) NSF will be very important audience
            a> will be part of report to NSF and that MAKES it public
      c) would make good first thing to be seen if we have data management Icon on home page
      d) define terms in paragraph expanding bullets
      e) tried to stay away from specific promises or tools
      f) responsibility for outreach
         1) will be under human resources
   6. need working group to finish up
      a) primary contact - Susan with major support from Barbara
      b) people in group + Susan
      c) needed SOON - draft to group by 15 August
         1) completed by 31 August

I. Site Review Guidelines for Data Management Working Group
   1. Group Members
      a) Melendez
      b) Nolen
      c) Hayden
      d) Briggs
      e) Porter
      f) Tomeck
   2. purpose - help with site reviews, give advice to new sites
   3. Critical areas
      a) CC: mandates
         1) guidelines for data policies
         2) online data sets
         3) archival storage system in place
      b) administrative structure
         1) are data managers PIs? if not..
         2) is there an appropriate management structure -
            a> relationship between data management and scientists
            b> philosophy of management
            c> how long does it take to get a dataset accessible
               1> policy in place?
         3) long-term stability - fallback if DM leaves or dies - need policy
         4) are appropriate technical resources in place - MSI
      c) are there areas of excellence or deficiency
         1) network participation
            a> could mandate participation at LTER DM meeting
         2) datasets - are they being collected, archived
         3) support of the scientific enterprise
         4) innovation
         5) training
   4. Increasing visibility for electronic reviews
      a) DM Vita showing network and international activities
         1) publications
         2) should be online so can be accessed prior to review
      b) online site capabilities - could incorporate progress reports
         1) periodic review - progress report on data management would be helpful
      c) suggest data management Icon on net office web page
         1) include info on committees, documents etc.
   5. Document for Advising Site Reviewers
      a) condensed vision emphasizing critical role of Information Management
         1) underlying philosophy
      b) copy of guidelines site management policies
      c) list of critical things
      d) minimum functionality - more specific
      e) no SPECIFIC way something must be done.... focus on functions
   6. Species list query
      a) some activities sweep across network -- DM plays important roles
      b) would be good to have approval from executive committee or CC of queries requiring substantial work at each site
         1) needs to be two way street -- communication with DM
   7. Discussion
      a) need to find way to raise importance of network-level participation
         1) partner in network
      b) attraction of DM group is enthusiasm of group
         1) want to increase activity
         2) voluntary / peer-pressure
         3) may not be a good idea to impose network obligation
   8. Discussion
      a) what next
         1) writing - fleshing out document
         2) under DM Icon on LTERNET server
            a> establish Icon on LTERNET to support review process
               1> access to network pages
            b> would also help get us some of the credit for what we do
      b) when sites put up pages to deliver information to site reviewers, then sites know what they need to do
      c) Gosz - reasonable to have DM Icon
         1) remember you are responsible for content
         2) strategic vision
         3) review criteria
         4) more
      d) Susan and Jim should play active role in final version of report to add verbs that will make it usable
      e) does this meet need for LMER and new sites?
         1) would this document do?
         2) need documentation on what it takes to do full time DM
         3) this document may not be ready in time
      f) primary contact
         1) Barbara
         2) other folks in working group
         3) Gil Calabria
      g) Timeline - needed by October CC meeting
         1) also critical for January Exec. Meeting
         2) if sites won't establish minimum, NSF WILL!!!!
         3) need first draft by end of August

J. Gosz - timeline on network office proposal
   1. anticipate deadline around January with panel and site visits with award around September
      a) overlap between current network office award is due to possible need for transition
         1) some money freed up during first year by overlap
            a> need innovative things to do -- opportunity fund
         2) servers
         3) workshops
         4) new technologies

K. Outreach - focus on input to Network Office
   1. Network Office timeline is important
   2. sustained money for support of DM workshops
      a) includes support for some small meetings between meetings
   3. need for Workshops - hard to anticipate 6 years in advance
      a) targeted list of workshops that span 1-2 year period
      b) others will also be proposing workshops
         1) CC will make choices and decisions
      c) suggestions:
         1) Planning grant for RTG - research training grant
            a> RTGs preproposal due in October 1995 is timing problem
            b> expertise around table would strengthen RTG
            c> can be inter-institutional
            d> need very strong justification - tie in to pockets of expertise
         2) providing support for expertise and/or equipment at network office so it can be mainstreamed
            a> special agents/scouts
            b> dedicated server to run information system
         3) workshops need to have network development focus
            a> site specific expertise
            b> but value at network level
            c> outreach can't be domain specific -- those should use traditional routes
            d> possible workshops
               1> regionalization efforts
                  a: technology transfer within region to gov. etc.
                  b: we are not hoarding expertise
               2> developing partnerships with SDSC
                  a: hardware
                  b: expertise
                     1: 3-D models - can create physical models
                     2: scientific visualization
               3> field station network
                  a: could export well-tested system
                  b: like "connectivity pack"
               4> planning workshops for information system
               5> international
                  a: it can be extremely demanding -- money for travel is NOT problem
                  b: time commitment
                  c: need to build on top of existing infrastructure
                  d: nodes, not first order sites - satellite locations
               6> value added, rather than long-term sustained "sinks" for funding
   4. comments
      a) RTG brings up issue of education
         1) will be very surprised if Net Office RFP will not include education component
      b) field stations workshop is way to spread expertise
      c) graduate students - saw value of DM
      d) earlier spoke about internal outreach
         1) inservices could be opened up to others
      e) important to take augmented sites and confront issues about incorporating social sciences
      f) suggest workshop for Cross-Site Grants to look at developing model for relating site DM to cross-site DM
      g) JVC outreach to NASA has been very helpful
         1) could have workshops to help fully exploit data
      h) mailing list for DM95 - includes people at this meeting
   5. need very fast turnaround on this!!!

L. Information Management
   1. Part I - Hastings
      a) designing information system for next decade of LTER
         1) long term vision
         2) short term action
         3) difficult - like nailing jello to a tree!
   2. Part II - Nottrott
      a) What is it? - Vision
         1) distributed system
         2) naturally includes catalog & bibliography
         3) complex query and display tools
            a> need to be built to aid specific tasks
      b) How to get there?
         1) develop in modular fashion
            a> step-by-step approach
         2) build on present functionality
            a> don't want to reinvent existing parts "for sake of distribution" unless it improves functionality
         3) catalog and biblio good start
         4) mechanism - 1 or more workshops to develop vision and start development of prototype
            a> site scientists
            b> computer scientists
   3. comments
      a) what, when, where
         1) not much discussed
      b) should not just rush out and start
      c) balance planning and implementation
         1) are there some initial steps that can be taken right now
         2) things that will stand us in good stead regardless of initial steps
      d) need to focus on WHAT, not HOW
      e) "pent up demand"
      f) suggestions on funding workshops
         1) network office would have lag
         2) DBA proposal would be helpful
         3) could dedicate next DM workshop to very intense session
            a> could reschedule meeting
      g) is it monolithic or break down and focus on parts
         1) it will be implemented in pieces
         2) things to think about in working groups
      h) is there agreement there should be a system?
      i) agreement on modular approach?
         1) agreement on approach, but modules need to be defined
            a> need to address architectural issues - takes a long time
            b> need prototyping efforts, but not shots in the dark
         2) defining functionality needs to be first step
            a> means involving users
         3) what are steps involved in getting architectural design?
            a> needs both information managers and domain scientists
            b> need to have workshop with domain and computer scientists with focus on specific problem
            c> could do workshop in context with non-LTER site
            d> could pull in folks from other project and see what pitfalls they encountered
         4) could look at support for intersite project - lots of domain scientists
      j) may be good chance to bring in PIs -- what is scientific question driving it?
      k) would like to get good handle on question of what is driving need for information system!
         1) cross site efforts present both obstacles and opportunities
      l) would prefer not to focus on ONE project, but rather focus on a number of individuals from cross site projects
      m) workshop
         1) cross-site people
         2) computer scientists
         3) how big?
         4) how soon?
      n) would be valuable to have sustained effort on this over course of next year
         1) 3 workshops - one LTER, then two LTERs, then all 18 LTERs
            a> open invitation to attend from all LTERs
      o) need to get specific - real PIs with real problems
      p) general agreement on need for information
         1) need input from PIs
   4. charge
      a) workshops needed
      b) pre-workshop activities
         1) develop list of possible attendees
      c) chart long-term plan
      d) criteria for success of workshop

M. Working Group Reports
   1. group 1
      a) goal to be able to elucidate what people really want
      b) user need consensus would indicate success
         1) might take multiple workshops
      c) how do you get information from people?
         1) information managers put together framework and PIs respond
         2) could react to existing information systems
            a> difficulties
            b> blocks to syntheses
            c> prototype system interactions
               1> they know what they like!
      d) pre-workshop activity
         1) survey PIs -- but some had very negative reactions
            a> but information manager at each site can do informal query
            b> would get much more positive response 1:1
         2) would be good to bring IMs back as group to compare notes on needs etc.
            a> could do as small regional working groups
               1> possibly teleconferencing
      e) focus on research area
         1) react with prototype around scientific questions
      f) computer literate PIs would be good to invite
      g) try to set up planning with idea that there may be future workshops --- with short time-frame
      h) comments
         1) like idea of talking to PIs
            a> graduate students would also be good
         2) make sure PI buys lunch!!!!
   2. group 2
      a) charge: to facilitate LTER science
         1) science addresses specific questions
         2) info system must be able to answer questions in paradigmatic framework
            a> works at site - 50 scientists
            b> now expanding to 1400 scientists
      b) need to identify the types of questions the system must address
         1) intersite projects are asking these questions
         2) must also deal with PRODUCTS from intersite projects
            a> DMs from sites in cross sites should participate
      c) next focus on level of complexity needed
         1) assess existing systems
      d) workshop
         1) PIs with cross site interests
         2) IMs
         3) computer experts
            a> eventually include people from other agencies
      e) comments
         1) possibility of agency people in first round
   3. group 3
      a) workshop(s)
         1) mini-workshops at individual sites
            a> focus on current capabilities
            b> could fill out questionnaire after
            c> develop common outline across sites for format of workshop
               1> including range of recommended demos
            d> should be very simple
            e> possibly done at subset of sites or with varying formats to meet local needs
         2) workshop
            a> objective: focus on user needs for cross-site/network research
            b> who should attend workshop?
               1> PIs & Grad students interested in doing cross-site work
                  a: send general request for volunteers
                  b: minimum 1 from each site
               2> IMs
               3> potential collaborators in information science
            c> content
               1> demos - LTER and others
               2> identification of specific questions you would like to see answered
         3) workshop
            a> objective: short and long-term plan for system development
               1> short term - clear cut steps
               2> long term view
            b> who should attend workshop?
               1> limited number of PIs from first workshop
      b) pre-workshop
         1) questionnaire
            a> what are your site's current capabilities?
            b> what services are most important and useful now?
            c> what have researchers tried to do and been frustrated?
            d> what functional additions are most critical?
            e> send to ALL in LTER!
               1> wide distribution
         2) meet with own PIs to discuss
         3) query about interest in PIs
      c) alternative to workshops
         1) do lots of questionnaires, iteratively
      d) criteria for success
         1) written plan of action
   4. Group 4
      a) focus on visionary PIs
         1) developed list
         2) outreach to other LTERs
      b) developed list
         1) brainstorming workshop
         2) would put in proposal for spring of 1996
      c) need to get information up front
         1) approach with same sets of questions
         2) work at local PI meetings
         3) then 1:1
      d) comments
         1) demonstrations of functionality important
            a> network accessible
         2) graduate students must be included
   5. comments
      a) approach depends on site
         1) different levels of expertise
      b) PI-grad student teams
         1) take questions
         2) apply to own and other sites
   6. ad hoc committee appointed

N. Brief overview of Hastings NSF project
   1. NSF proposal to DBA
      a) not thinking too large - focus on McMurdo
      b) joint proposal with Wharton and some external organizations to begin prototyping laboratory information system for McMurdo
         1) BIOSPHERE II - Oracle-based laboratory system
         2) reference system
         3) link to GIS
            a> ESRI
      c) partially funded
         1) original - 3 workshops with prototyping periods in between
            a> 780K over 2 years
            b> lots of participant support costs
         2) concerns
            a> is this the right technology?
               1> seemed predetermined
            b> where is the science?
            c> centered only on McMurdo -- how about involving others?
               1> LTER involvement
      d) NSF funding first workshop to answer those questions
      e) hard to deal with these issues
         1) want to use as springboard for LTER overall
   2. planned workshop
      a) workshop to plan prototype involving GIS, Antarctica
         1) hard to get all pulled together in short time
      b) organizers
         1) Baker
         2) Hastings
         3) Calkins - Associate Director NCGIA
      c) sponsors
         1) Greenland
         2) Wharton & Gross
         3) Stafford
   3. would appreciate help from group
      a) how can we use consensus of support from NSF
   4. workplan -
      a) preparatory activities
      b) timetable
   5. comment
      a) is this different from network-wide system?
         1) is only a string of workshops
         2) lots is not covered here - catalogs, bibliographies
         3) only focus is weather and GIS
      b) why late involvement by DM group?
         1) were not involved in original proposal
         2) did not receive much response from data task

O. Final tasks
   1. vote on internal structure
   2. next year's meeting - committee
      a) Briggs
   3.