Top new questions this week:
|
BACKGROUND: The softmax function is the most common choice for an activation function for the last dense layer of a multiclass neural network classifier. The outputs of the softmax function have …
|
I have been reading some papers recently (example: arxiv.org/pdf/2012.00363.pdf) which seem to be training individual layers of, say, a transformer, holding the rest of the model frozen/…
|
Problem Description: Since I am not sure if there is a scientific term that categorizes this problem, I will do my best to describe it thoroughly. Suppose there is a chamber that’s being filled with …
|
Let’s say I have a trained deep learning model. It would be good if it were numerically stable, so that if I change the input by a small amount, the output also changes by a small amount. How should I …
|
Hi, I got the following ROC curve: What does this mean? Does this have to do with overfitting? Is my data preprocessed incorrectly? I do not understand and would appreciate an answer.
|
I am reading the book Reinforcement Learning: An Introduction, second edition (Richard S. Sutton and Andrew G. Barto). In the k-armed bandit problem using the $\varepsilon$-greedy selection method, the …
|
What are some non-RL-based approaches to solving a typical bin assignment problem, i.e., given a set of items (can be multi-dimensional), find the bin/knapsack/target which best packs (with minimum …
|
Greatest hits from previous weeks:
|
I’m a bit confused about the definition of life. Can AI systems be called ‘living’? Because they can do most of the things that we can. They can even communicate with one another. They are not …
|
In mathematics, the word operator can refer to several distinct but related concepts. An operator can be defined as a function between two vector spaces, it can be defined as a function where the …
|
What are the differences between meta-learning and transfer learning? I have read two articles, on Quora and Towards Data Science. Meta-learning is a part of machine learning theory in which some …
|
Do scientists or research experts know from the inside what is happening in a complex “deep” neural network with at least millions of connections firing at an instant? Do they understand …
|
I am not looking for an efficient way to find primes (which of course is a solved problem). This is more of a “what if” question. So, in theory, could you train a neural network to predict …
|
What are “bottlenecks” in the context of neural networks? This term is mentioned, for example, in this TensorFlow article, which also uses the term “bottleneck values”. How does …
|
In the context of evolutionary computation, in particular genetic algorithms, there are two stochastic operations “mutation” and “crossover”. What are the differences between them?
|
Can you answer these questions?
|
In the context of RL, say I’m performing Value Iteration on a reward function R1. And the converged optimal policy is P1 and values are V1. Then, let’s say I set rewards to be R2=V1 and perform value …
|
I am reading an article on creating an embedding that is order-invariant to its inputs ($m_{l}$). They referred to order-invariance in eq. (5) as follows: The order-invariant operation happens in (5), …
|
I am reading the paper “Large Language Models Can Self-Improve” (arxiv.org/abs/2210.11610), in which the authors consider that an LLM can generate Chain-of-Thought sequences and even …
|