Top new questions this week:
|
Looking at the sklearn tfidf page: scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html and trying to understand the difference between term frequency …
|
If I understand correctly, the Key hidden layer in the Transformer XL is of size 2L * d, where …
|
I’m trying to create a model to predict the number of players on a video game at a certain time and was trying to figure out how to integrate categorical data into my forecasting problem. So far, my …
|
In this paper, the authors say that they used IO schema instead of BIO in their dataset, which, if I am not wrong, means they just tag the corresponding Entity Type or “O” in case the word …
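A minimal sketch of the IO reading described in this question, assuming IO means each token keeps only its entity type or "O" (some IO variants keep an `I-` prefix instead). The `bio_to_io` helper and the sample tags are hypothetical and do not come from the paper:

```python
def bio_to_io(tags):
    """Collapse BIO tags to IO by dropping the B-/I- prefix (assumed IO convention)."""
    io_tags = []
    for tag in tags:
        if tag == "O":
            io_tags.append("O")
        else:
            # "B-PER" and "I-PER" both become "PER"
            _prefix, _sep, entity_type = tag.partition("-")
            io_tags.append(entity_type)
    return io_tags


# Hypothetical example sentence: "Alice Smith visited New York"
bio = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]
io = bio_to_io(bio)
print(io)
```

Under this reading, IO can no longer mark where one entity ends and an adjacent entity of the same type begins, which is the information the B- prefix carries in BIO.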
|
I’m working on a project to distinguish bots from legitimate users on social media. The data I collected is about 5% bots and 95% legitimate users. The problem is that, as I labelled my data, I was more …
|
I was reading a blog post about improving machine-learning model train/validate/test splits. Towards the end was this remark: I say we should be more creative in the way we test machine learning …
|
Do people use n-grams or 1,2,3,…n-grams in both matrix factorisation and generative models in Topic Modeling? I’ve been trying to understand the basics of Topic Modeling and came to know that there …
|
Greatest hits from previous weeks:
|
I have a (2M, 23) dimensional numpy array X. It has a dtype of <U26, i.e. unicode string …
|
The model of linear regression is linear in parameters. What does this actually mean?
|
I am trying to perform k-means clustering on multiple columns. My data set is composed of 4 numerical columns and 1 categorical column. I already researched previous questions but the answers are not …
|
I was watching Machine Learning A-Z from SuperDataScience, but when I was running the code sample below: …
|
I used to apply K-fold cross-validation for robust evaluation of my machine learning models. But I’m aware of the existence of the bootstrapping method for this purpose as well. However, I cannot see …
|
I have a pandas data frame with several entries, and I want to calculate the correlation between the income of some type of stores. There are a number of stores with income data, classification of …
|
Can you answer these questions?
|
I have been working on a supervised ML use case where the dataset has numerical (Price), categorical (Category) and textual (Description) features. The Description feature has about 30% missing values. …
|
I don’t get why, for a classification task with missing values, with $n$ input variables, we can obtain all $2^n$ different classification functions needed for each possible set of missing inputs. …
|
Consider a binary classification problem. Intuitively, a value for the area under the curve (for both curves) very close to 1 shows that the curve is almost L-shaped. Thus, this means that the value …
|