3/06/2026

Use LLM (e.g., ChatGPT) to learn *** and its impact

Research

Be careful

Courses and information 

I am studying the following code from the book Mastering Reinforcement Learning with Python by E. Bilgin:

    def first_visit_return(returns, trajectory, gamma):
        G = 0
        T = len(trajectory) - 1
        for t, sar in enumerate(reversed(trajectory)):
            s, a, r = sar
            G = r + gamma * G
            first_visit = True
            for j in range(T - t):
                if s == trajectory[j][0]:
                    first_visit = False
            if first_visit:
                if s in returns:
                    returns[s].append(G)
                else:
                    returns[s] = [G]
        return returns
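To see what the function computes, here is a minimal sketch that runs it on a tiny trajectory. The trajectory, the state and action names, the value of gamma, and the final averaging step are my own assumptions for illustration and are not taken from the book. The function is repeated so the snippet runs on its own.

```python
def first_visit_return(returns, trajectory, gamma):
    G = 0
    T = len(trajectory) - 1
    # Walk the episode backwards so G accumulates the discounted return.
    for t, sar in enumerate(reversed(trajectory)):
        s, a, r = sar
        G = r + gamma * G
        # The step being processed has index T - t in the original order.
        # It is a first visit only if s does not appear at an earlier index.
        first_visit = True
        for j in range(T - t):
            if s == trajectory[j][0]:
                first_visit = False
        if first_visit:
            if s in returns:
                returns[s].append(G)
            else:
                returns[s] = [G]
    return returns


# Hypothetical episode of (state, action, reward) tuples; state "A" is visited twice.
trajectory = [("A", "right", 1), ("B", "right", 0), ("A", "right", 2)]

returns = first_visit_return({}, trajectory, gamma=0.5)
print(returns)  # {'B': [1.0], 'A': [1.5]}

# Assumed follow-up step: average the recorded returns to estimate each state's value.
values = {s: sum(g) / len(g) for s, g in returns.items()}
print(values)  # {'B': 1.0, 'A': 1.5}
```

Only the first occurrence of "A" (index 0) contributes a return, 1 + 0.5 * (0 + 0.5 * 2) = 1.5. The second visit at index 2 is skipped because "A" already appeared earlier in the episode.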

I typed in "Please comment the code" (adding, in Chinese, "please write the comments in Traditional Chinese"), and here is the (amazing) result.
