On testing the scraper, I ran into a few issues. I could resolve them myself, but you should consider handling them.
The chromedriver.exe is Windows-only and won't work on my system (macOS); I haven't tried Linux, but I suspect it won't work there either. It kept giving:
```
start
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chromedriver.exe' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home
```
Also, consider not adding the driver binary to the repo. Write a detailed README as discussed, and ask users to download the version compatible with the Selenium driver you are using - Chrome/80.0.3987.100.
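One way to handle this cross-platform is to resolve the driver binary at runtime instead of hard-coding chromedriver.exe. A minimal sketch, assuming the drivers live in a local `drivers/` directory (the directory name and helper are illustrative, not from the project):

```python
import os
import platform
import stat

# Assumed layout: drivers/chromedriver.exe (Windows) or drivers/chromedriver
# (macOS/Linux), downloaded by the user per the README.
DRIVER_NAMES = {
    "Windows": "chromedriver.exe",
    "Darwin": "chromedriver",  # macOS
    "Linux": "chromedriver",
}

def resolve_driver(driver_dir="drivers"):
    name = DRIVER_NAMES.get(platform.system())
    if name is None:
        raise RuntimeError(f"Unsupported OS: {platform.system()}")
    path = os.path.join(driver_dir, name)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} not found - download the driver matching your Chrome "
            "version from the chromedriver downloads page"
        )
    # On Unix the binary must be executable, otherwise Selenium raises the
    # 'wrong permissions' WebDriverException shown above.
    if platform.system() != "Windows":
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path
```

The resolved path can then be passed to `webdriver.Chrome(executable_path=...)`.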
The scraped data is heavy; I have CSV files of up to 148 MB, which ideally I shouldn't have to pull just to test your MR. I only need the links (URLs) so I can test the code on my end. Merging to the master branch will handle downloading the data. We have discussed this several times!
Consider adding a detailed README! If you need help with the macOS installation guide, I can chime in, since you are working on Windows, I presume.
Always check whether directories/files exist, and create them if they don't, before writing data to them.
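A minimal sketch of that pattern; the function and path names here are illustrative, not from the project:

```python
import os

def save_csv(rows, out_dir, filename):
    """Write rows to a CSV file, creating the directory first if needed."""
    os.makedirs(out_dir, exist_ok=True)  # no-op if the directory exists
    out_path = os.path.join(out_dir, filename)
    with open(out_path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(",".join(str(cell) for cell in row) + "\n")
    return out_path
```

`exist_ok=True` avoids the race-prone check-then-create pattern: the call succeeds whether or not the directory already exists.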
@Darshpreet2000 this is GREAT. The repo looks much cleaner and saner without the data/CMD, and the Selenium setup works well. I love it! The README still needs a lot of improvement, though; don't forget to document every step, and a troubleshooting section is also important.
You may want to add this to the troubleshooting section so macOS users can grant permissions manually. Alternatively, they can install the driver with `brew cask install chromedriver`.
Good job. As for the pipeline, I am fairly sure it is a permissions issue; post more of the errors you got and I may be able to help. As for the GUI, that can happen after GSoC, or after the main objectives are attained here; keep processing the file (py) as you are doing currently, it won't hurt. I am rather thinking of a built-in REST server, maybe Flask, to test the endpoints locally with the app. That will be discussed later; for now, focus on making the pipeline work, and feel free to ping me should you run into any errors.
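For later reference, a minimal sketch of what such a local server might look like, assuming Flask; the route name and payload shape are placeholders, not agreed endpoints:

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/costs/<state>")
def costs(state):
    # In the real server this would read the processed CSVs for `state`;
    # here we return a stub payload so the app can be exercised locally.
    return jsonify({"state": state, "items": []})

if __name__ == "__main__":
    # Local testing only; not intended as a production deployment.
    app.run(port=5000, debug=True)
```

Something this small is enough for the mobile app to hit real HTTP endpoints during development without any hosted backend.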
Processing the files works, but I get this error on New York:
```
Saving
Traceback (most recent call last):
  File "Process CDM/Indiana/process.py", line 7, in <module>
    import pypyodbc
ModuleNotFoundError: No module named 'pypyodbc'
No Data Found For /Users/muarachmann/Documents/open_source/LibreHealth/lh-toolkit-cost-of-care-app-data-scraper/Data/New York
```
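One way to make a missing dependency like this fail with an actionable message instead of a bare traceback is to guard the import. A sketch; the helper name is an assumption, not project code:

```python
import importlib

def require(module_name, pip_name=None):
    """Import a module, or exit with an install hint if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        raise SystemExit(
            f"{module_name} is not installed. Run "
            f"'pip install {pip_name or module_name}' and retry."
        )

# Usage at the top of process.py (pypyodbc is the module from the
# traceback above):
# pypyodbc = require("pypyodbc")
```

Listing pypyodbc in a requirements file and mentioning it in the README would prevent the error in the first place.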
I have completed the pipeline and scraper work. We can now discuss the REST server with Flask needed for the app; I am ready to build it. Please share the details.
I don't think you need a complex GUI for the scraper. It is more important to focus on the mobile app, building it with unit tests, so that it can be released to the iOS App Store and Google Play Store at the end of the summer.
I am in complete agreement; the idea is just to make the scraper work, and we don't need a complex GUI, or even a GUI at all, actually. I've written plenty of tools to automate boring tasks, and a GUI is almost never needed; when it is, a crude UI is enough to get the work done, no need for snazziness. If it distracts from the main project, it's counterproductive: we need that data, but the scraper itself is not the project.
@r0bby, @sunbiz I am going for an initial merge; the MRs are heavy, and the base for the projects has been laid properly. Subsequent reviews will be easier. Let me know your thoughts.
@muarachmann, ultimately, I'm the big-picture guy making sure we get all the required evaluations done and that students are on track. I 100% trust your judgement; do whatever you feel is right.
I see it is triggered on every build? I am about to make a PR to the develop branch from master; will this pull data in there? Did you consider restricting the pipeline to run only on master?
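If this is a GitLab CI pipeline (an assumption, since we work with MRs here), restricting the data job to master is a one-keyword change. A sketch; the job name and script command are placeholders:

```yaml
# Hypothetical .gitlab-ci.yml fragment: run the data-pulling job only on
# master, so MRs and branch builds don't download the heavy data.
pull_data:
  script:
    - python scraper.py   # placeholder command
  only:
    - master
```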