<p>I've managed to resolve this by collapsing the data, so I no longer need help with it.</p>
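<p>For anyone hitting the same problem, here is a minimal sketch of what collapsing could look like. The variable names (price, country, month) are hypothetical, and taking the mean within each country and month is an assumption about what the repeated observations represent:</p>

```stata
* Hypothetical variable names: price, country (coded 1 and 2), month
* Keep one observation per country and month
collapse (mean) price, by(country month)

* line connects points in the order of the data, so sort by time first
sort country month
twoway (line price month if country == 1) ///
       (line price month if country == 2), ///
       legend(order(1 "Country 1" 2 "Country 2"))
```

<p>Lines that criss-cross between points usually mean that several observations share the same month, or that the data are not sorted by month; the sort option of line is another way to deal with the second case.</p>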
Wed, 26 Jan 2022 14:27:52 GMT Tim Wraight
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d77e6d30b8017e91b6fdc27e8c
<p>I am working with a dataset which has house prices in two countries, over the span of around 10 years. I am trying to plot a two way line graph of the monthly house prices in each country. However, when I input my code it joins every dot to every other dot of the country instead of just the dot either side. </p>
<p>This is weird as I have managed to achieve this before. However, I made a couple of changes to the dataset I was using (I added a couple of extra years), and although I copied and pasted the code from my original do-file, it no longer works.</p>
<p>Thanks</p>
Tue, 25 Jan 2022 14:49:55 GMT Tim Wraight
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d77e6d30b8017e83ba65f35dd5
<p>Hi,</p>
<p>The problem might come from the fact that the FE estimator does not estimate coefficients for time-invariant covariates, while the RE estimator does. The RE estimator therefore estimates more coefficients, which makes it impossible to match up every pair of coefficients obtained from the two procedures (see the formula for the Hausman test statistic). I’m not 100% sure, though: different fields often mean widely different things by “fixed” and “random” effects (see the discussion here: https://stats.stackexchange.com/questions/4700/what-is-the-difference-between-fixed-effect-random-effect-and-mixed-effect-mode).</p>
<p>Running the estimation without “quietly”, so that you can actually see the output, could give you a hint as to whether this is what is happening. You could also “list” the estimates to check whether the number and names of the entries in your two vectors of estimates are the same.</p>
<p>Based on the help file for hausman, it seems that you can use the option equations(matchlist), for instance equations(1:1, 2:2, 3:4), to tell the command which parts of the two sets of estimates to compare (here, the first with the first, the second with the second, and the third with the fourth). As far as I can tell from the help file, the numbers refer to equations rather than to individual coefficients: without the option, equations are matched by name, so two commands that label their equations differently can produce “no coefficients in common” even when the variables are identical, and equations(1:1) may be enough to fix this. To make sure that you are comparing coefficients that correspond to the same variable, you should still “list” the vectors of estimates. Coefficients that appear in only one of the two models (for instance those on time-invariant variables, if those are the source of the problem) should then be left out, so that the test only uses the estimates for the time-varying variables.</p>
<p>That being said, I don’t know if this is econometrically the right thing to do, or if by doing this, you would essentially be comparing apples and oranges, leading to wrong inference (it again depends on the derivation of the distribution of the hausman test statistic).</p>
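<p>Putting these suggestions together, here is a sketch based on the commands from the question. The choice equations(1:1) is an assumption that the slope coefficients sit in the first equation of both sets of estimates; check this against the output of matrix list before relying on it:</p>

```stata
* Estimate without quietly so that the output is visible
feologit LIFESAT7 BAME FEMALE UNEMP MARRIED ib(1).EDUCATION AGE AGE2 c.INCOME POLICY POLNET, vce(cluster pidp)
matrix list e(b)            // column names are shown as equation:coefficient
estimates store feforce

xtologit LIFESAT7 BAME FEMALE UNEMP MARRIED ib(1).EDUCATION AGE AGE2 c.INCOME POLICY POLNET, vce(cluster pidp)
matrix list e(b)
estimates store randforce

* Pair equation 1 of the first model with equation 1 of the second
hausman feforce randforce, force equations(1:1)
```

<p>If the two listings show different equation names in front of the same variables, that would be consistent with the “no coefficients in common” message.</p>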
Sat, 22 Jan 2022 21:38:58 GMT Margot Belguise
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d87e6d3349017e7dfa86f46e7e
<p>Hi again,</p>
<p>For now, I've decided to use the command "hausman, force" that you mentioned. Thank you for that suggestion!</p>
<p>However, I ran the Hausman test and it says that there are "no coefficients in common", which seems strange because both models use the exact same variables:</p>
<p>. quietly feologit LIFESAT7 BAME FEMALE UNEMP MARRIED ib(1).EDUCATION AGE AGE2 c.INCOME POLICY POLNET, vce(cluster pidp)<br>. estimates store feforce<br>. quietly xtologit LIFESAT7 BAME FEMALE UNEMP MARRIED ib(1).EDUCATION AGE AGE2 c.INCOME POLICY POLNET, vce(cluster pidp)<br>. estimates store randforce<br>. hausman feforce randforce, force<br><em><strong>no coefficients in common; specify equations(matchlist)</strong></em><br><em><strong>for problems with different equation names.</strong></em></p>
<p>I've looked up "equations(matchlist)" but I'm struggling to figure out how to apply it here.</p>
Fri, 21 Jan 2022 18:51:17 GMT Jordan Hall
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a17841a7e6d334d017e7d6fc77a5ab0
<p>Hi,</p>
<p>A few points in response:</p>
<ul>
<li>Based on the helpfile for the hausman test command, you can still run a hausman test despite clustered vce if you specify the option “force” when performing the test.</li>
<li>That said, if the command normally issues an error, it is because the test assumes non-clustered standard errors, so doing this might be misleading (the actual distribution of the test statistic under the null might differ from the distribution assumed by the command). Whether this is the case is an econometric question that would probably require some research, or a discussion with your tutor if you’re not sure.</li>
<li>According to the help file of feologit, the command always uses clustered standard errors. This makes sense in a panel setting: if you don’t allow for autocorrelation across observations of the same individual, you will probably overestimate the accuracy of your estimates. If you really want to get round it, one way might be to cluster at the observation level (which should be equivalent to not clustering, although it might still allow for heteroskedasticity), for instance:</li>
</ul>
<p>gen id=_n</p>
<p>and then use cluster(id) as an option of feologit.</p>
<p>You would still need to use “force” when performing the hausman test, since Stata should still store e(vce)=cluster.</p>
<p>Although the suggestions above should make it possible for you to obtain some results, whether those are adequate solutions or would lead to wrong inference is really an econometric question which depends on the assumptions on which the derivation of the asymptotic distribution of the test statistic relies, so I cannot really help in this respect.</p>
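<p>As a sketch, the workaround described above could look as follows, using the variables from the original question. The cluster() option is taken from the suggestion above and has not been checked against the feologit help file, and id is a hypothetical variable name:</p>

```stata
* One cluster per observation (hypothetical identifier name: id)
gen long id = _n

* cluster(id) is assumed to be accepted by feologit, as suggested above
feologit LIFESAT3 fimngrs_dv POLICY POLFIM, cluster(id)
estimates store fix

xtologit LIFESAT3 fimngrs_dv POLICY POLFIM
estimates store rand

* force is still needed because e(vce) is stored as cluster
hausman fix rand, force
```

<p>The same caveat applies: this only makes the commands run, it does not establish that the resulting test is valid.</p>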
Fri, 21 Jan 2022 16:19:44 GMT Margot Belguise
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d87e6d3349017e7bfa13304aed
<p>Hi Margot</p>
<p>Thank you for your response.</p>
<p>reshape long *variables*, i(familyid personid) j(childnum) seems to have done the trick.</p>
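<p>For illustration, a sketch of how such a reshape might look with concrete stubs. The variable names childage1, childage2, childsex1, childsex2 are hypothetical; only familyid, personid and childnum come from the command above:</p>

```stata
* Hypothetical wide layout: one row per parent, child variables suffixed 1, 2, ...
* e.g. childage1 childage2 childsex1 childsex2
reshape long childage childsex, i(familyid personid) j(childnum)

* Drop the empty rows created for parents with fewer children
drop if missing(childage) & missing(childsex)

* One identifier per child across the whole dataset
egen childid = group(familyid personid childnum)
```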
<p>Thanks again</p>
Fri, 21 Jan 2022 09:31:33 GMT Owen Wallbanks
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d77e6d30b8017e79ec857e4ddc
<p>Hello,</p>
<p>It seems that you forgot to upload the screenshot.</p>
<p>Maybe I could help after seeing the screenshot, but without it, it is very difficult to understand the issue you are encountering.</p>
Thu, 20 Jan 2022 23:57:30 GMT Margot Belguise
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d87e6d3349017e79ea45144a02
<p>Hi, hope you're well!</p>
<p>I'm running an ordered logit with a 2-period panel data set (dependent variable = life satisfaction rating, coefficient of interest= DiD interaction between income and a dummy for observations after a tax was imposed). </p>
<p>The standard command is xtologit, which works completely fine. The slight issue is that this command assumes *random effects*. Only quite recently has a *fixed effects* ordered logit command been made, called feologit. I installed this command and it seems to work ok. However, whenever I use feologit, it has always used cluster-robust standard errors. I don't even tell it to do it, it just does it automatically: it says "(Std. Err. adjusted for 12,739 clusters in pidp)"</p>
<p>The problem is that when I try to do the Hausman test, it fails because the Hausman command can't be used with vce(cluster).</p>
<p>. quietly feologit LIFESAT3 fimngrs_dv POLICY POLFIM<br>. estimates store fix<br>. quietly xtologit LIFESAT3 fimngrs_dv POLICY POLFIM<br>. estimates store rand<br>. hausman fix rand<br><em><strong>hausman cannot be used with vce(robust), vce(cluster cvar), or p-weighted data</strong></em></p>
<p>Is there a way to remove the cluster-robust standard errors from the feologit command?</p>
<p>If not, could I just compute the Hausman test manually, as the sum of the squared differences in coefficients divided by the sum of the differences in variances? (I'm assuming any coefficient that's time-invariant and excluded from the fixed effects model would just be set to 0?)</p>
Thu, 20 Jan 2022 23:55:03 GMT Jordan Hall
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a17841b7e6d30bd017e79e83d465e60
<p>Hello,</p>
<p>The command reshape long should probably achieve part of what you're trying to do. It automatically creates new identifiers.</p>
<p>Otherwise:</p>
<p> - egen with tag(varlist)</p>
<p> - bysort followed by egen (can be used to "group" variables based on some common value of a variable - e.g. common identifier -, and then you can create some new variable equal to max, min, mean of some other variable, by group)</p>
<p> - using some "if" conditions</p>
<p>might also be useful, however, it is difficult to know without seeing how your dataset is structured.</p>
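<p>A short sketch of the bysort, egen and tag suggestions above. The variable names familyid and childage are hypothetical, since the structure of the dataset is not known:</p>

```stata
* Group statistic: oldest child's age, repeated on every row of the family
bysort familyid: egen max_childage = max(childage)

* tag() marks exactly one row per family with 1 and all others with 0
egen family_tag = tag(familyid)

* Family-level summary that counts each family once
summarize max_childage if family_tag
```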
<p>I hope this helps!</p>
Thu, 20 Jan 2022 23:52:50 GMT Margot Belguise
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a17841b7e6d30bd017e79d6ab945e5d
<p>Hi,</p>
<p>I am not sure why you are getting this.</p>
<p>However, a simple way to choose for yourself which category is omitted would be to create the dummies directly, and then simply list all of them among your regressors except the one you want to exclude.</p>
<p>Here, it would mean:</p>
<p>1) create a new variable country_post_2015=COUNTRY*post2015</p>
<p>2) tab country_post_2015, gen(country_post_2015_dum)</p>
<p>Which will generate a set of dummies.</p>
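<p>The two steps above can be sketched as follows, reusing the regression from the question. This assumes that post2015 is coded 0/1, that COUNTRY takes positive integer codes with England equal to 1, and that every country is observed after 2015. The last line shows an alternative worth trying, in which factor-variable notation builds the main effects and the interaction together:</p>

```stata
* Step 1: 0 before 2015, the COUNTRY code afterwards
gen country_post_2015 = COUNTRY * post2015

* Step 2: one dummy per distinct value: country_post_2015_dum1, dum2, ...
tab country_post_2015, gen(country_post_2015_dum)

* Under the assumptions above, dum1 flags the value 0 (pre-2015) and
* dum2 the value 1 (England after 2015); check this against the tab output.
* Drop the dummies to be omitted and include the rest
drop country_post_2015_dum1 country_post_2015_dum2
reg ed year ib1.COUNTRY post2015 country_post_2015_dum* SEX if age1718 & SEX>0

* Alternative: main effects and interaction in one factor-variable term
reg ed year ib1.COUNTRY##i.post2015 SEX if age1718 & SEX>0
```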
<p>I hope this helps!</p>
Thu, 20 Jan 2022 23:33:38 GMT Margot Belguise
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d77e6d30b8017e79be17e34dcd
<p>Hello,</p>
<p>With respect to the difference between Z and t-statistics, for N-k sufficiently large, they are equivalent, but Z statistics are only appropriate asymptotically, i.e. for N-k large. You can check a statistical table to verify whether, in the case of your regression, N-k is large enough to use Z statistics. Otherwise, repest stores e(b) and e(V), so you could also manually compute the t test statistics, but this probably would be cumbersome with a high risk of making some mistake.</p>
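<p>To see how much the choice matters in practice, a sketch like the following compares the two reference distributions for the first coefficient in e(b). The degrees of freedom (500) are a hypothetical placeholder for your own N-k, and whether N-k is the right choice after repest is itself an assumption:</p>

```stata
* Run after repest, which stores e(b) and e(V)
matrix b = e(b)
matrix V = e(V)

* Test statistic for the first coefficient in e(b)
scalar teststat = b[1,1] / sqrt(V[1,1])

* Hypothetical degrees of freedom; replace with your own N - k
local df = 500

display "statistic:          " teststat
display "p-value, normal:    " 2 * normal(-abs(teststat))
display "p-value, t(`df'):   " 2 * ttail(`df', abs(teststat))
display "5% critical values: " invnormal(0.975) " (normal) vs " invttail(`df', 0.025) " (t)"
```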
<p>With respect to ovtest, I think the best solution is probably to run the test yourself, without using the command ovtest, since the test essentially boils down to a regression. See here for an example: https://blog.ms-researchhub.com/2020/05/14/ramsey-reset-test-on-panel-data-using-stata/. All you need to do is compute the fitted values after your regression, which you should be able to do since repest stores e(b). If predict does not work after repest (although it probably should, since e(b) is stored according to the help file), this discussion might help you find a way to compute the fitted values: https://statalist.org/forums/forum/general-stata-discussion/general/1362382-extract-coefficients-from-e-b-after-regress. It should be feasible to include weights when running the test by using standard options (since the test itself is a regression), or maybe even to use repest to run the test so that the appropriate weights are included. However, I really don't know whether this would be a good idea, or what the statistical properties of the test would be if you did this; it is probably something you should discuss with your tutor. A much more cumbersome alternative would be to work out which stored results ovtest uses and create locals with the same names containing the results you need. But since ovtest is a simple linear regression, it is probably much simpler to run the test yourself, and also safer than using a black-box command that might interact strangely with your previous estimation.</p>
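<p>A minimal sketch of running the test by hand with plain regress, using powers of the fitted values as ovtest does. The variable names y, x1 and x2 are hypothetical, and the sketch ignores the plausible values and weights that repest handles, so it only illustrates the mechanics:</p>

```stata
* Original regression (hypothetical variable names)
regress y x1 x2

* Fitted values and their powers
predict double yhat, xb
gen double yhat2 = yhat^2
gen double yhat3 = yhat^3
gen double yhat4 = yhat^4

* Augmented regression: the RESET test is a joint test on the added powers
regress y x1 x2 yhat2 yhat3 yhat4
test yhat2 yhat3 yhat4
```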
<p>I hope this helps!</p>
Thu, 20 Jan 2022 23:06:48 GMT Margot Belguise
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d87e442ae4017e5d6bd23f67bc
<p>Hello!</p>
<p>The dataset I'm using contains information on sibling pairs (the respondents) and their children. My project focuses on the children of these sibling pairs, but the issue is that, since the children are essentially variables linked to their parents' ID, it is quite difficult to run analysis on them. I was thinking of rearranging the data by creating an extended family identifier that links the sibling pairs and their children together, a nuclear family identifier that links children to their parents, and a personal ID for the children themselves. I would then also have a variable that identifies whether the individual is part of the sibling pair or a child. I'm not sure how to do this or if there's a better way to achieve the same outcome.</p>
<p>Many thanks for your help</p>
Sat, 15 Jan 2022 11:07:34 GMT Owen Wallbanks
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a17841a7e442ae8017e58794d232da4
<p>Hi,</p>
<p>I'm running some preliminary analysis with a difference-in-differences model using the following code in Stata:</p>
<p> reg ed year ib(1).COUNTRY post2015 ib(1).COUNTRY ib(1).COUNTRY#i.post2015 SEX if age1718 & SEX>0</p>
<p>The default works as I expect for the country output (treating COUNTRY == 1 as the default), however I get the attached output for the interaction term.</p>
<p>Here, England corresponds to COUNTRY == 1, so I'm not sure why it isn't treating England as the default.</p>
<p>Do you have any ideas what I'm doing wrong and how I could fix this?</p>
<p>Thanks,</p>
<p>Matthew</p>
Fri, 14 Jan 2022 12:04:11 GMT Matthew Oulton
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a17841b7e442832017e4f6314de6c96
<p>To whom it may concern,</p>
<p>Thank you for your time.</p>
<p>I would appreciate some guidance regarding why Stata is outputting two results for TREATMENT#BLOCK1 ...3. I have attached a screenshot below to clarify the issue. </p>
<p>For context, I am only looking at the UK (no group effects) and am looking across 3 time periods. </p>
<p><img src=""></p>
<p>Thank you for your help,</p>
Wed, 12 Jan 2022 17:43:20 GMT Finn Clark
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a17841b7e442832017e4ea0112b679c
Wed, 12 Jan 2022 14:10:20 GMT Amogh Patil
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d77db86903017e441f752c742c
<p>Hello,</p>
<p>My analysis involves using PISA survey data. The OECD strongly recommends using the repest command when running a regression, to account for plausible values and weights appropriately. However, the output of this regression looks a bit alien to me: z-statistics are given as opposed to t-statistics, and it shows neither the R-squared nor the number of observations. When I try to display these values, it gives a blank response. I've attached a log file showing this. I was wondering if there is any way to overcome this, or if this issue is unavoidable. It also won't let me run an ovtest, which I had hoped would be a big part of my analysis.</p>
<p>Thanks :) </p>
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a17841b7db86906017e101cb9c81ebd
Fri, 31 Dec 2021 10:50:25 GMT Beth Walton
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d87db86bbf017e05cdb24a7408
<p>Hi,</p>
<p>I have been trying to carry out a Hausman test to find out if I should be using fixed or random effects for my panel data analysis. However, I have been getting a probability of 1 which I am assuming is wrong - I have tried numerous things suggested on forums, such as xtoverid and adding sigmamore in but none of them seem to work. Any advice would be much appreciated!</p>
<p>Kind regards,</p>
Wed, 29 Dec 2021 10:47:53 GMT Beth Walton
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d77db86903017ddcfeb55c59f1
<p>Hello, </p>
<p>I am trying to use the command geonear to calculate the distances between colleges and households that answered a survey. I have the coordinates of the households (longitude and latitude) saved in my survey dataset and a second data set with the coordinates and names of colleges (longitude and latitude). I wanted to use the geonear command to calculate the distance to the closest college for each household and identify the closest one. </p>
<p>geonear household latitudeofhousehold longitudeofhousehold using "College_Coodinates.dta", n(college latitudeofcollege longitudeofcollege)<br><br><br>However, I got the following error notification "latitudeofcollege or longitudeofcollege not constant within college group". I am not sure what this means and how I can fix it...</p>
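<p>The wording of the message suggests that some value of college appears on more than one row of the college dataset with different coordinates. A sketch of how one might check this, using the file and variable names from the command above (the variable differs is a hypothetical helper):</p>

```stata
use "College_Coodinates.dta", clear

* How many rows share the same college identifier?
duplicates report college

* Flag colleges whose coordinates are not the same on every row
bysort college (latitudeofcollege longitudeofcollege): ///
    gen byte differs = latitudeofcollege[1] != latitudeofcollege[_N] | ///
                       longitudeofcollege[1] != longitudeofcollege[_N]

list college latitudeofcollege longitudeofcollege if differs, sepby(college)
```

<p>If the list is not empty, the identifier does not pin down a single location, and it would need to be made unique (or the coordinates corrected) before geonear can use it.</p>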
Tue, 21 Dec 2021 12:36:59 GMT Katharina Meyer
https://warwick.ac.uk/fac/soc/economics/current/modules/ec331/forum/raeforum/?post=8a1785d87d765031017db7bcd30c6f77
Tue, 14 Dec 2021 06:59:05 GMT Christos Alexandrou