For this week's assignment we are given mock data on how frequently patients visit the hospital (Freq), their blood pressure (BP), and three doctors' ratings of each patient's condition. The first doctor is a general doctor who simply rates the patient "bad" or "good"; the other two are external doctors who rate the need for immediate care as "low" or "high". We are told to represent these ratings numerically as 1 or 0 (bad = 1, good = 0; high = 1, low = 0), which is the coding used in the vectors below.
Variables are: "Freq", "BP", "First", "Second", "Final".
1. "0.6","103","bad","low","low"
2. "0.3","87","bad","low","high"
3. "0.4","32","bad","high","low"
4. "0.4","42","bad","high","high"
5. "0.2","59","good","low","low"
6. "0.6","109","good","low","high"
7. "0.3","78","good","high","low"
8. "0.4","205","good","high","high"
9. "0.9","135","NA","high","high"
10. "0.2","176","bad","high","high"
Our main goal this week is to make a side-by-side boxplot and a histogram of the data presented to us.
We are also expected to discuss whatever results we were able to obtain from the data.
Let us begin:
First we need to convert each variable to a vector:
>Freq <- c(0.6,0.3,0.4,0.4,0.2,0.6,0.3,0.4,0.9,0.2)
>BP <- c(103,87,32,42,59,109,78,205,135,176)
>First <- c(1,1,1,1,0,0,0,0,NA,1)
>Second <- c(0,0,1,1,0,0,1,1,1,1)
>Final <- c(0,1,0,1,0,1,0,1,1,1)
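The three rating vectors above were typed in by hand. As a cross-check, here is a minimal sketch that derives them from the text labels in the table instead, assuming the coding the vectors follow (bad and high as 1, good and low as 0). The names first_raw, second_raw, and final_raw are hypothetical and are not part of the assignment.

```r
# Raw labels copied from the table; row 9 has no rating from the first doctor.
first_raw  <- c("bad","bad","bad","bad","good","good","good","good",NA,"bad")
second_raw <- c("low","low","high","high","low","low","high","high","high","high")
final_raw  <- c("low","high","low","high","low","high","low","high","high","high")

# Recode the labels: bad/high become 1, good/low become 0.
# A comparison with NA yields NA, so the missing rating stays NA.
First  <- ifelse(first_raw == "bad", 1, 0)
Second <- ifelse(second_raw == "high", 1, 0)
Final  <- ifelse(final_raw == "high", 1, 0)
```

Recoding from the labels makes it harder to mistype a 0 or 1, and the NA carries through without any special handling.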
From here I decided to use what we learned last session and organize all of our data in a data frame:
>docdf <- data.frame(Freq,BP,First,Second,Final,stringsAsFactors = FALSE)
Now that we have our data organized, we can decide how we want to analyze it.
I decided to focus on how the patients' frequency of hospital visits and blood pressure relate to how all three doctors rated the patients' condition.
What I will do is add up the three doctors' ratings (each 1 or 0) for every patient and take the majority. A sum greater than 1 (at least two of the three doctors rating 1) marks a patient the doctors were, overall, concerned about; a sum of 1 or less marks a patient they were not critically concerned about.
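To make the majority rule concrete, here is a small sketch that computes the vote count for every patient at once with rowSums. The names votes and concerned are hypothetical, and the missing first rating for patient 9 is simply dropped from that patient's sum, which is an assumption about how to treat the NA.

```r
# Doctor ratings as coded in the post (1 = bad/high, 0 = good/low).
First  <- c(1,1,1,1,0,0,0,0,NA,1)
Second <- c(0,0,1,1,0,0,1,1,1,1)
Final  <- c(0,1,0,1,0,1,0,1,1,1)

# Number of doctors who rated each patient 1; NA is ignored in the sum.
votes <- rowSums(cbind(First, Second, Final), na.rm = TRUE)

# Majority rule: at least two of the three doctors rated the patient 1.
concerned <- votes > 1

# How many patients fall in each group.
table(concerned)
```

With this data the rule puts six patients in the concerned group and four in the unconcerned group.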
Here is the function I came up with to represent these data as a side-by-side boxplot:
# This function creates a boxplot of either Freq or BP, grouped by the MDs'
# overall rating; colm selects the column (1 = Freq, 2 = BP)
> plotBox <- function(df,colm){
+ if (colm !=1 && colm!=2){return("Please pick either colm 1 or 2")} # A check to make sure users enter either 1 or 2 for colm
+ docs <- vector()
+ zeros <- vector()
+ ones <- vector()
+ for(i in 1:nrow(df)){
+ docs <- c(docs,(sum(df[i,3:5],na.rm = TRUE))) # count the 1s among the three doctors, ignoring NA
+ if(docs[i] >1){ones <- c(ones,df[i,colm])} else{zeros <- c(zeros,df[i,colm])}
+ }
+ if(colm ==1){return(
+ boxplot(ones,zeros,
+ main= "Boxplot of frequency values based on overall MDs' rating",
+ names= c("Concerned","Unconcerned"),
+ ylab ="Frequency of hospital visits in a 12 month period"))
+ }
+ else if (colm==2){return(
+ boxplot(ones,zeros,
+ main= "Boxplot of BP values based on overall MDs' rating",
+ names= c("Concerned","Unconcerned"),
+ ylab ="BP Values"))
+ }
+ }
First, in our function we state the arguments that will be used. To keep things modular, we have the df argument for the data frame, and colm selects either Freq or BP (1 or 2, respectively).
I added an immediate return message in case the user enters a value other than 1 or 2.
(Note: originally I had two different functions, one for the Freq boxplot and one for the BP boxplot, but since the code was largely repeated except for a few variables, I decided to combine the two to reduce the size of the overall code and used an extra argument to switch between Freq and BP.)
Next I have three empty vector variables.
docs will hold the consensus rating of the three doctors, which is obtained in a for loop (more about the loop later).
zeros and ones use the consensus rating of the doctors: the corresponding BP/Freq value is added to zeros if more doctors rated 0 and to ones if more doctors rated 1. This is explained further in the for loop section next.
In our for loop, we go through each row of the data frame. Since every column of a data frame has the same number of rows, the loop visits every patient, determines the overall consensus of the doctors, and assigns the BP/Freq value to the respective category. The first line of the for loop counts up all the 1s between the doctors. Note that we needed to add na.rm to the sum function to ignore the NA that was in the data for the First doctor.
+ docs<- c(docs,(sum(df[i,3:5],na.rm = TRUE)))
The next line looks at the sum and determines whether that row's value belongs in zeros or ones:
+ if(docs[i] >1){ones <- c(ones,df[i,colm])} else{zeros <- c(zeros,df[i,colm])}
Finally, the last part of the function checks whether we are focusing on the patients' frequency of hospital visits or their BP and returns a labeled boxplot accordingly.
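The loop grows three vectors one element at a time, which works fine for ten rows. As an alternative, here is a sketch of a loop-free variant under the same majority rule. The names consensusGroup and plotBoxVec are hypothetical and this is not the function used for the plots in this post; it only illustrates the same grouping with rowSums and a formula-based boxplot.

```r
# Same mock data as in the post.
docdf <- data.frame(
  Freq   = c(0.6,0.3,0.4,0.4,0.2,0.6,0.3,0.4,0.9,0.2),
  BP     = c(103,87,32,42,59,109,78,205,135,176),
  First  = c(1,1,1,1,0,0,0,0,NA,1),
  Second = c(0,0,1,1,0,0,1,1,1,1),
  Final  = c(0,1,0,1,0,1,0,1,1,1)
)

# Hypothetical helper: label each row by the doctors' majority rating.
consensusGroup <- function(df) {
  votes <- rowSums(df[, 3:5], na.rm = TRUE)  # an NA counts as no vote
  factor(ifelse(votes > 1, "Concerned", "Unconcerned"),
         levels = c("Concerned", "Unconcerned"))
}

# Hypothetical loop-free variant of plotBox; colm is 1 (Freq) or 2 (BP).
plotBoxVec <- function(df, colm) {
  stopifnot(colm %in% c(1, 2))
  grp <- consensusGroup(df)
  # The formula interface splits the chosen column by group in one call.
  boxplot(df[[colm]] ~ grp,
          main = paste("Boxplot of", names(df)[colm],
                       "values based on overall MDs' rating"),
          xlab = "", ylab = names(df)[colm])
}
```

Calling plotBoxVec(docdf, 1) or plotBoxVec(docdf, 2) should draw the same two groups as plotBox, with the group labels coming from the factor levels rather than the names argument.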
Here are the plots:
From our line:
>plotBox(docdf,1)
and our line:
>plotBox(docdf,2)
For the histogram, I simply ran the histogram function on our Freq vector to get a visual idea of how often patients are coming to the hospital.
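A minimal sketch of that histogram call, wrapped in a small function so the title and axis label are set in one place. The wrapper name plotFreqHist and the label text are hypothetical; the only call it relies on is base R's hist.

```r
# Frequency of hospital visits from the mock data.
Freq <- c(0.6,0.3,0.4,0.4,0.2,0.6,0.3,0.4,0.9,0.2)

# Hypothetical wrapper around hist(); returns the histogram object invisibly,
# so the bin counts can be inspected afterwards if needed.
plotFreqHist <- function(freq) {
  hist(freq,
       main = "Histogram of hospital visit frequency",
       xlab = "Frequency of hospital visits in a 12 month period")
}
```

Running plotFreqHist(Freq) draws the plot; assigning the result, as in h <- plotFreqHist(Freq), also gives access to h$breaks and h$counts.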
For the function that returns a mean, the arguments are mostly the same as plotBox, but I added one more argument, oz, that is either 1 or 0; it returns the mean of either the concerned patients (oz = 1) or the unconcerned patients (oz = 0).
The main difference from plotBox is in the last two lines: instead of returning a boxplot, we return a mean.
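The code for that function does not appear in this post, so the following is only a sketch of what it could look like under the description above: the same df and colm arguments as plotBox, plus oz to pick the group. The name groupMean and the vectorized grouping are assumptions, not the original code.

```r
# Same mock data as in the post.
docdf <- data.frame(
  Freq   = c(0.6,0.3,0.4,0.4,0.2,0.6,0.3,0.4,0.9,0.2),
  BP     = c(103,87,32,42,59,109,78,205,135,176),
  First  = c(1,1,1,1,0,0,0,0,NA,1),
  Second = c(0,0,1,1,0,0,1,1,1,1),
  Final  = c(0,1,0,1,0,1,0,1,1,1)
)

# Hypothetical function: mean of Freq (colm = 1) or BP (colm = 2) for the
# concerned patients (oz = 1) or the unconcerned patients (oz = 0).
groupMean <- function(df, colm, oz) {
  if (!(colm %in% c(1, 2))) return("Please pick either colm 1 or 2")
  if (!(oz %in% c(0, 1))) return("Please pick either oz 0 or 1")
  votes <- rowSums(df[, 3:5], na.rm = TRUE)  # an NA counts as no vote
  concerned <- votes > 1                     # majority of doctors rated 1
  if (oz == 1) mean(df[concerned, colm]) else mean(df[!concerned, colm])
}

# Mean BP of each group.
groupMean(docdf, 2, 1)
groupMean(docdf, 2, 0)
```

Comparing the two means for the same column gives a single number per group to read alongside the boxplots.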