Custom regular report for Stanford Medicine Center for Improvement (SMCI) #1201
Various options are considered below.

Option A: Modify the sul_pub codebase to add this report

Modify the existing sul_pub system to add the new users so they can be harvested. Either create a custom report whose results can be output on demand via a new API call, or expect the Profiles team to pull all the publications needed by user and then run the aggregate report on their end. Assuming we cannot add the additional authors not already in the profiles system to the authors table in some way (i.e. by assigning them special/unique cap_profile_ids), this option is not advisable due to the extensive modifications needed in a production system.

Challenges:

Rough estimate: 2-3 weeks for a couple of developers, likely introducing new bugs in the process.

Option B: Create a new system

Create a new codebase with a new database that contains only the authors of interest (who could be in profiles or outside of profiles). Either create a custom report whose results can be output on demand via a new API call, or expect the Profiles team to pull all the publications needed by user via an API call and then run the aggregate report on their end. This option has the potential to be time consuming from a development standpoint.

Challenges:

Rough estimate: 3-4 weeks for a couple of developers, plus ongoing maintenance and support costs.

Option C: Find a solution using existing tools

Do not modify sul_pub or create a new system, but instead investigate the use of existing tools we have access to, such as the Web of Science UI, the Dimensions UI, etc. A staff member would need to periodically run a number of queries and then manually aggregate the results into a report.

Challenges:

Rough estimate: 1 week or so of a developer helping another non-developer (e.g. Jacob) find a solution, plus the ongoing time cost for this other staff member to run the reports.

Option D: Write a scripty solution

This is a combination of options B and C. It takes a coding approach to manual reports, minimizing the repetitiveness of running reports over and over again, while not going as far as creating a whole new system accessible via APIs. For example, a Ruby script could import a CSV file of authors and use the Dimensions or Web of Science API to query for and produce lists of publications for the aggregate reporting. We could return a big list of publications to the Profiles team, which they could use for reporting, or we could add development time to produce the report ourselves. It still requires software development time, but less than would be required for a full-fledged system available via API calls. It could be run from a QA server or even a developer laptop to minimize demands on Ops. It also still requires ongoing support and maintenance, but less than would be required for a full system. And it requires someone technical enough to run a script on a regular basis (though this person doesn't necessarily need to be a software developer).

Rough estimate: 2 weeks or so of a developer helping another non-developer (e.g. Jacob) script a solution, plus the ongoing time cost for this other staff member to run the scripts.

Option E: Use ORCID/CrossRef/other systems

This makes use of an ORCID/CrossRef connection to have authors keep their ORCID profiles up to date. It essentially becomes a training/support task to help authors get ORCID profiles set up and populated. It may also require development to produce a report from all of the ORCID profiles.

Challenges:

Rough estimate: a few days of Jacob or Peter working with other staff to find the solution and help with training.

Summary

Options A & B are not advisable for various reasons. Option C is the least intrusive to the software development teams. Option D may be the most practical if we choose to take this on. Option E is the best path to encourage ORCID adoption, but the least likely to result in an actual report in a reasonable time frame.
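Option D's scripty approach could look something like the following sketch. Everything here is hypothetical: the class name, the CSV column names (`first_name`, `last_name`, `institution`), and the injected `fetcher` (which stands in for a real Dimensions or Web of Science API client) are all assumptions for illustration, not an actual implementation.

```ruby
require "csv"

# Hypothetical sketch of Option D: read a CSV of authors of interest and
# collect publications for each one from an external API. The API call is
# injected as a callable ("fetcher") so the script logic can be shown
# without committing to a particular service.
class ScriptyReport
  def initialize(fetcher:)
    # fetcher: callable taking (last_name, institution) and returning an
    # array of publication hashes
    @fetcher = fetcher
  end

  # csv_text: CSV text with a header row describing the authors of interest.
  # Returns a flat list of publication hashes, each tagged with the author
  # that was queried for.
  def run(csv_text)
    authors = CSV.parse(csv_text, headers: true)
    authors.flat_map do |author|
      @fetcher.call(author["last_name"], author["institution"]).map do |pub|
        pub.merge("queried_author" => "#{author['first_name']} #{author['last_name']}")
      end
    end
  end
end
```

In a real run, the fetcher would wrap Dimensions or WoS API calls, and the resulting list would be dumped back out as CSV for the Profiles team.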
Additional questions for Tina:
More possible ideas:
OR
OR, even easier, by name and institution directly
So we could write a rake task to run two sets of queries and merge the results: one against our local database for approved publications by authors with profile IDs, and one against WoS for the authors without profile IDs. This may be faster since we have the same pub_hash structure in each case, ready to go.
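The merge step of the rake task idea above could be sketched as follows. The de-duplication key (`wos_uid`, falling back to `doi` and then title) is an assumption; both inputs are assumed to be arrays of pub_hash-like hashes.

```ruby
# Hypothetical sketch: combine approved publications from the local
# database with publications fetched from WoS for authors without
# profile IDs, dropping duplicates that appear in both sets.
def merge_publications(local_pubs, wos_pubs)
  (local_pubs + wos_pubs).uniq do |pub|
    # Prefer a stable identifier; fall back to a normalized title.
    pub["wos_uid"] || pub["doi"] || pub["title"].to_s.downcase
  end
end
```

Since `Array#uniq` keeps the first occurrence, listing the local (approved) publications first means the sul_pub copy wins when a publication appears in both sources.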
See #1204 for a custom class/rake task that does the report.
Here is an example of a dashboard made using Dimensions / Google BigQuery. If it meets the reporting requirements, it could be the least intrusive option for the development team and could provide additional views into the data (I'm not sure if those would be useful). The downside is the added cost.
Those dashboards in GBQ are definitely nifty, and I think they are a nice solution for some of the custom requests we've fielded in the RIALTO world. I'm not 100% sure it fulfills this particular request though, as I believe they may be using the result of this report simply to import publications into a website, so I'm not sure they benefit from the nice UI in the report; they may instead just need a CSV. The other challenge is that they need approved publications from sul_pub for authors that have profiles, in addition to this second set of authors that are not in profiles, so there are two separate data sources. The sul_pub data requires either direct database queries on the server, or using our own internal API, which can return the publications as JSON.
Consider producing a custom report on a regular basis that includes both publications for users currently in profiles, as well as users not currently in profiles.
Most likely it will be a once a month report. They need to have a citation or be able to build the citation they want. Therefore, we were going to provide them with the most recent approved publications (from the last date received) with the following:
Additionally, we wanted to provide them with the separated fields that form the citation, in case they want to do their own formatting for the citation display on their website. So that would be, for example:
This would also need to include the author's details, when available:
The thought is that, if you can pull the information for those without profiles, it would possibly make sense for you to produce the full report for everyone, also including the data for those with profiles (and thus some of the fields noted above), so that the report comes from one combined source. The number of researchers ranges from 94 to 130 based on the current information we have. Of these, around 33 did not have profiles or active profiles, and a number do not have full profiles with the publication import option. Also, we have staff members in the list who would not have publication import turned on by default either, as this is only on by default for faculty and postdocs.
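A minimal sketch of what the combined monthly CSV could look like, with both the separated fields and a pre-built citation per row. The field names and the citation format here are illustrative guesses at the "separated fields that form the citation" mentioned above, not a confirmed spec.

```ruby
require "csv"

# Illustrative only: these field names are assumptions about what the
# separated citation fields might include.
REPORT_FIELDS = %w[authors title journal year volume pages doi].freeze

# pubs: array of hashes keyed by the field names above.
# Returns CSV text with one row per publication, ending in a formatted
# citation built from the separated fields.
def report_csv(pubs)
  CSV.generate do |csv|
    csv << REPORT_FIELDS + ["citation"]
    pubs.each do |pub|
      # Hypothetical citation format; the requesters may want their own.
      citation = "#{pub['authors']} (#{pub['year']}). #{pub['title']}. #{pub['journal']}."
      csv << REPORT_FIELDS.map { |f| pub[f] } + [citation]
    end
  end
end
```

This keeps the report in the single combined form suggested above: one CSV covering both the profiled and non-profiled authors.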